Abstract

In this part, we consider the capacity analysis for wireless mobile systems with
multiple antenna architectures. We apply the results of the first part to a commonly
known baseband, discrete-time multiple antenna system where both the transmitter
and receiver know the channel's statistical law. We analyze the capacity for additive
white Gaussian noise (AWGN) channels, fading channels with full channel state
information (CSI) at the receiver, fading channels with no CSI, and fading channels
with partial CSI at the receiver. For each type of channel, we study the capacity
value as well as issues such as the existence, uniqueness, and characterization of the
capacity-achieving measures for different types of moment constraints. The results are
applicable to both Rayleigh and Rician fading channels in the presence of arbitrary
line-of-sight and correlation profiles.

Index Terms 

Capacity, capacity-achieving measure, channel state information, fading, multiple 
antenna, Rayleigh, and Rician. 
1 Introduction



Multiple antenna architectures have an increasingly important role to play in emerging wire- 
less communication networks, particularly at base stations in cellular systems. Indeed, when 
used in conjunction with appropriately designed signal processing and coding algorithms, 
such architectures can dramatically enhance the performance of wireless mobile systems. 
However, in order to make efficient use of the resources, it is necessary to understand the 
fundamental limits of these systems. 

Because of the time-varying nature of the channel in wireless mobile systems, the channel
state (realization) changes over time. This phenomenon, called fading, imposes new challenges
in determining the capacity of the channel. For this purpose, it is essential to know how
much knowledge we have about the channel states either at the transmitter or at the re- 
ceiver. In practice, depending on the application, we might have a range of scenarios from 
no channel state information (CSI) to full CSI. Hence, the capacity analysis and optimal 
coding strategies for different CSI scenarios could be quite different. For example, if full 
CSI is available at both the transmitter and the receiver, then the capacity-achieving input 
distribution is Gaussian, and the optimal encoder employs a power adaptation algorithm 
(water pouring) [1], [2], [3], [4]. In contrast, in the presence of full CSI at just the receiver, 
the capacity-achieving input distribution is Gaussian [5], [6], but the encoder uses the same 
average power over all time instances. This scenario is well investigated for multiple antenna 
channels in the presence of i.i.d. Rayleigh fading [7], [8], and recent interest in this area
includes determining the capacity and the capacity-achieving measures in the presence of
arbitrary correlation and line-of-sight fading components [9], [10], [11], [12], [13], [14].

Unlike these scenarios, the capacity and capacity-achieving distributions for fading channels
in the absence of CSI, such as applications where the fading changes rapidly, are generally
unknown even for the case of single-input single-output (SISO) channels. For multiple-input
multiple-output (MIMO) channels with no CSI, Hochwald and Marzetta [15] have addressed
the capacity problem for Rayleigh channels under certain assumptions on the SNR regime
and on the ratio of the number of transmitters to the coherence time of the channels. For
SISO channels, [16] was the first rigorous result in this area that addressed the
characterization of the capacity-achieving input distribution (subject to an average power
constraint) for Rayleigh channels. Unlike Rayleigh channels, the capacity and capacity-achieving
input distributions for Rician channels in the absence of CSI have barely been studied. For low
signal-to-noise ratio, [17] showed that the capacity-achieving input distribution (subject to
second and fourth moment constraints) is discrete. Asymptotic upper and lower bounds for
the capacity of fading channels are derived in [18], subject to a maximum-power constraint.
More results can be found in [19] and [20].

In this part, we use the results of Part I and study the capacity problem of MIMO
channels in a unified manner, irrespective of the type of fading, the correlation profile, and
the amount of available knowledge about the CSI at the receiver. More precisely, we study the
capacity problem for AWGN channels, fading channels with full CSI at the receiver, fading
channels with no CSI, and fading channels with partial CSI at the receiver. For each type of
channel, we investigate its capacity as well as issues such as the existence, uniqueness, and
characterization of the capacity-achieving measures of multiple antenna channels subject to
different types of input moment constraints. The organization of this paper and a summary
of our contributions are as follows.

In Section 2, we introduce the multiple antenna system setup. In Section 3, we address the
capacity analysis for additive white Gaussian noise (AWGN) channels. Let $n$ and $m$ denote
the number of transmit and receive antennas, respectively, and let $X = \mathbb{C}^{n}$ and
$Y = \mathbb{C}^{m}$ denote the input and output alphabets of the channel. Suppose the channel
realization is described by $H \in \mathbb{C}^{m \times n}$. For moment constraints of type
$E(\|x\|_\eta^\eta) \le \Gamma$ ($1 < \eta$),$^1$ we show that the capacity-achieving measure,
$P_o$, exists uniquely. If $\eta > 2$, we show that the capacity-achieving measure has a bounded
support with no interior point. In contrast, if $1 < \eta \le 2$, then a necessary condition for
$P_o$ is that for sufficiently large $y$ in the column space of $H$,
$\sup_{r>0} e^{-r^2} P_o(\|y - Hx\|_2 < r\sigma) = O(e^{-\alpha\|y\|_\eta^\eta})$ for some
$\alpha > 0$. For the case of $\eta = 2$, we also derive the capacity-achieving measure for these
channels using Kuhn-Tucker conditions, where the result is the same as previously known
results in [8].

In Section 4, we address the capacity analysis of MIMO fading channels with full CSI at
the receiver for Rayleigh or Rician channels with arbitrary correlation profile. For moment
constraints of type $E(\|x\|_\eta^\eta) \le \Gamma$ ($1 < \eta$), we show that the
capacity-achieving measure, $P_o$, exists uniquely. If $\eta > 2$, we show that the
capacity-achieving measure has a bounded support with no interior point. In contrast, if
$1 < \eta \le 2$, then a necessary condition for $P_o$ is that for almost every channel side
information $v \in \mathbb{C}^{m \times n}$,
$\sup_{r>0} e^{-r^2} P_o(\|y - vx\|_2 < r\sigma) = O(e^{-\alpha(v)\|y\|_\eta^\eta})$
for sufficiently large $y$ in the column space of $v$ and for some positive function
$\alpha : V \to \mathbb{R}^+$. For $\eta = 2$, we fully characterize the capacity-achieving
measure for these channels, where our results reduce to the results of [8] for the case of
isotropic Rayleigh channels.

$^1$ Refer to [21] for the definition of the $\eta$-norm.

In Section 5, we address the capacity analysis of MIMO fading channels with no CSI at
the receiver for Rayleigh or Rician channels with arbitrary correlation profile. For moment
constraints of type $E(\|x\|_\eta^\eta) \le \Gamma$ ($1 < \eta$), we show that the
capacity-achieving measure, $P_o$, exists uniquely. If $\eta > 2$, we show that the
capacity-achieving measure has a bounded support with no interior point. In contrast, if
$1 < \eta \le 2$, a necessary condition for the capacity-achieving measure is

PoiWvh < \\x\\2,n{y) c n{x)) = o(e-"W), 

where $\mathcal{R}(\cdot)$ denotes the row space of a matrix, and $\alpha > 0$ is a constant.

In Section 6, we address the capacity analysis of MIMO fading channels with partial CSI
at the receiver. We consider a certain class of estimators where the channel side information
is jointly Gaussian with the channel realization. For moment constraints of type
$E(\|x\|_\eta^\eta) \le \Gamma$ ($1 < \eta$), we show that the capacity-achieving measure,
$P_o$, exists uniquely. If $\eta > 2$, we show that the capacity-achieving measure has a
bounded support with no interior point. In contrast, if $1 < \eta \le 2$, a necessary
condition for the capacity-achieving measure is

where $\mathcal{R}(\cdot)$ denotes the row space of a matrix, $V$ denotes the state information
space, $\lambda_{\min}(v)$ denotes the minimum eigenvalue of the covariance of the channel
realization conditioned on $v$, and $\alpha : V \to \mathbb{R}^+$ is a positive function.

Finally, Section 7 states some concluding remarks along with some directions for future 
research. 
2 General System Model



We assume a wireless communication system employing $n$ transmit and $m$ receive antennas,
where the baseband model of the channel is described by a discrete-time model as follows.
For each pair of transmit and receive antennas, $(s, r)$, the path from transmit antenna $s$ to
receive antenna $r$ is represented by a complex symbol $h_{rs}$ called the path gain. Let
$H \in \mathbb{C}^{m \times n}$ denote the $m \times n$ matrix with the $h_{rs}$'s as its
entries, which is known as the channel state or channel realization.$^2$ Correspondingly, we
denote the space of all possible states as $\mathcal{H} = \mathbb{C}^{m \times n}$.

We assume a block channel model where in each block the channel is used $L \ge 1$ times;
$L$ is called the block length. In each channel use, all antennas are used simultaneously,
and $n$ complex symbols are transmitted through the $n$ transmit antennas. We assume the
channel is governed by a linear statistical model as follows.

Let $X = \mathbb{C}^{n \times L}$ and $Y = \mathbb{C}^{m \times L}$ denote the input and output
alphabets, respectively. At each block $k$, a matrix $x_k \in X$ is transmitted through the
transmit antennas and a matrix $y_k \in Y$ is received at the receiver, which is described by

$$ y_k = H_k x_k + z_k, \tag{1} $$

where $z_k \in \mathbb{C}^{m \times L}$ denotes the additive noise at block $k$. We assume that
the noise matrices are temporally independent, with independent and identically distributed
(i.i.d.) complex normal entries which have zero mean and variance $\sigma^2$, i.e.,
$\mathcal{CN}(0, \sigma^2)$. We assume that the channel state remains unchanged during each
block, but it might change after each block. We assume that at each block, there exists an
element $v_k$ available at the receiver that gives some information about the channel state
$H_k$. This enables us to consider a broad range of channel state information (CSI) scenarios,
from no CSI, where $v_k$ and $H_k$ are statistically independent, to full CSI, where conditioned
on $v_k$ there is no uncertainty about $H_k$. We assume that $v_k$ belongs to a Borel-measurable
space $V$, that there exists a joint measure $Q \circ R$ on $\mathcal{H} \times V$, and that the
input alphabet $X$ is statistically independent from $\mathcal{H} \times V$. Note that to fully
characterize the statistical properties of the channel, we need to specify the joint
probability measure $Q \circ R$; however, we postpone this to later sections, where we address
it in different scenarios.

$^2$ To clarify any confusion which might be raised by this notation in comparison to our
notation in Part I, one should note that the channel realization matrix, $H$, denotes the state
of the channel, which was previously denoted by $s$. The reason for this change is to comply
with the common notation in the literature on MIMO channels.
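As a minimal sketch of the block model (1), the following Python snippet forms one received block $y_k = H_k x_k + z_k$ from given matrices. The function name `simulate_block`, the nested-list matrix layout, and the use of the standard `random` module are illustrative assumptions, not part of the original model description.

```python
import random


def simulate_block(H, X, sigma2, rng):
    """Return Y = H X + Z for one block (hypothetical helper).

    H is an m x n list of lists (channel state), X is an n x L list of
    lists (transmitted block), and Z has i.i.d. CN(0, sigma2) entries.
    """
    m, n = len(H), len(H[0])
    L = len(X[0])
    # A CN(0, sigma2) sample has independent real and imaginary parts,
    # each with variance sigma2 / 2.
    std = (sigma2 / 2.0) ** 0.5
    Y = []
    for r in range(m):
        row = []
        for t in range(L):
            signal = sum(H[r][s] * X[s][t] for s in range(n))
            noise = complex(rng.gauss(0.0, std), rng.gauss(0.0, std))
            row.append(signal + noise)
        Y.append(row)
    return Y


# Example: m = 2 receive antennas, n = 2 transmit antennas, L = 1.
rng = random.Random(0)
H = [[1, 1j], [0, 2]]
X = [[1], [1]]
Y_noiseless = simulate_block(H, X, 0.0, rng)
Y_noisy = simulate_block(H, X, 0.1, rng)
```

Setting `sigma2` to zero removes the noise term, which is a convenient way to check the matrix product by hand.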

We assume that there exists a nonnegative continuous function $g : X \to \mathbb{R}^+$ and a
positive value $\Gamma > 0$, such that for a $K$-length block code, the codewords are chosen to
satisfy $\frac{1}{K}\sum_{k=1}^{K} g(x_k) \le \Gamma$. Motivated by practical scenarios, we
consider $g(x) = \|x\|_\eta^\eta$ ($1 < \eta < \infty$), where the most common choice is
$g(x) = \|x\|_2^2$ (the Frobenius norm) and $\Gamma$ is the average energy per block. By the
Law of Large Numbers [22, p. 325], as $K$ grows to infinity, this is equivalent to assuming
that the empirical measures of codes are obtained from a set of input probability measures
which are characterized by a continuous positive function $g(x)$ together with a real value
$\Gamma > 0$ as follows,

$$ \mathcal{P}_{g,\Gamma}(X) = \Big\{ P \in \mathcal{P}(X) \;\Big|\; \int g(x)\,dP \le \Gamma \Big\}. $$

Note that for the choice $g(\cdot) = \|\cdot\|_\eta^\eta$ ($1 < \eta < \infty$),$^3$ one can
easily verify that $g$ satisfies the hypothesis of Lemma 3.1 of Part I; hence,
$\mathcal{P}_{g,\Gamma}(X)$ is weak* compact.
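To make the moment constraint concrete, here is a small sketch that evaluates $g(x) = \|x\|_\eta^\eta$ for a codeword and checks the empirical average over a codebook against $\Gamma$. The helper names `eta_norm_power` and `satisfies_moment_constraint` are hypothetical, and codewords are assumed to be flat lists of complex symbols.

```python
def eta_norm_power(x, eta):
    """Return ||x||_eta^eta, the sum of |x_i|^eta over the entries of x."""
    return sum(abs(v) ** eta for v in x)


def satisfies_moment_constraint(codewords, eta, gamma):
    """Check (1/K) * sum_k g(x_k) <= gamma for g(x) = ||x||_eta^eta."""
    average = sum(eta_norm_power(x, eta) for x in codewords) / len(codewords)
    return average <= gamma


# Two codewords with energies 25 and 0: the average energy is 12.5.
codebook = [[3 + 4j, 0], [0, 0]]
ok = satisfies_moment_constraint(codebook, 2, 12.5)
too_tight = satisfies_moment_constraint(codebook, 2, 12.0)
```

With $\eta = 2$ this reduces to the usual average-energy constraint; larger $\eta$ penalizes large symbol magnitudes more heavily.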

It just remains to specify the generic statistical law that governs the channel, that is, to
describe $W(\cdot|x, H)$. For this purpose, recall that the additive white noise is a complex
normal random matrix with i.i.d. components. Hence, the channel is described by the conditional
measure, $W(\cdot|x, H)$, which is absolutely continuous with respect to the Lebesgue measure,
i.e., $W(\cdot|x, H) \ll \mu_Y$, with the density function

$$ f(y|x, H) = \frac{1}{(\pi\sigma^2)^{mL}}\, e^{-\frac{\|y - Hx\|_2^2}{\sigma^2}}. \tag{2} $$

Let us define an auxiliary measure $T$ as follows:

$$ \forall E \in \mathcal{B}_Y, \quad T(E) = \frac{1}{\beta\,(\pi\sigma^2)^{mL}} \int_E e^{-\frac{\alpha^2\|y\|_2^p}{\sigma^2}}\, d\mu_Y, \tag{3} $$

where $1 < p < 2$ and $0 < \alpha < 1$ are free variables and $\beta > 0$ is chosen such that
$T$ has a unit norm. Since (2) is nonzero, one can verify that $W(\cdot|x, H) \ll T$ for all $x$
and $H$, and the density function of $W(\cdot|x, H)$ with respect to $T$ is described by

$$ f_T(y|x, H) \triangleq \frac{dW(\cdot|x, H)}{dT} = \beta\, e^{\frac{\alpha^2\|y\|_2^p - \|y - Hx\|_2^2}{\sigma^2}}. \tag{4} $$

$^3$ Refer to [21] for the definition of the $\eta$-norm.
3 Additive White Gaussian Noise Channels



In this section, we consider a class of wireless channels where the physical medium between
the transmitter and receiver remains unchanged throughout the communication. We assume
that the channel is governed by the linear model (1) and the channel state (realization) is
$H_k = H$ for all blocks, where $H$ is known at both the transmitter and the receiver. Thus,
we characterize the channel just by the matrix $H$. To emphasize that a channel is Gaussian,
we use a subscript "G" and denote the channel by $W_G(H)$. Since $H$ is known, there is no
advantage in taking $L > 1$. Hence, throughout this section, we assume that $L = 1$.

In the framework of the general system model, we consider $V = \mathcal{H}$, where the
probability measure $Q \circ R$ on $\mathcal{H} \times V$ is a point mass measure (Dirac
measure) [21] at $(H, H)$. This can be explained as follows. Since there is no uncertainty
about the channel realization given the knowledge of $v$ at the receiver, the conditional
probability measure $Q_v$ on $\mathcal{H}$ is a point mass measure at $H = v$. Moreover,
because the channel state remains unchanged, the measure $R$ on $V$ is a point mass measure
at $v = H$. As a result, we observe that the channel can be simply described by the measure
$W_{Q_H}(\cdot|x) \ll T$ with the density function

$$ f_{T,Q_H}(y|x) = \int f_T(y|x, H)\, dQ_H = \beta\, e^{\frac{\alpha^2\|y\|_2^p - \|y - Hx\|_2^2}{\sigma^2}}. \tag{5} $$

For every $P \in \mathcal{P}_{g,\Gamma}(X)$, it can be verified that $PW_{Q_H} \ll T$ with the
density function

$$ f_{T,P,Q_H}(y) = \beta \int e^{\frac{\alpha^2\|y\|_2^p - \|y - Hx\|_2^2}{\sigma^2}}\, dP. \tag{6} $$

As a result, we simplify the expression of the mutual information ((6) of Part I) for the
additive white Gaussian channels as

$$ I(P, W_G(H)) = \iint f_{T,Q_H}(y|x) \log_2 f_{T,Q_H}(y|x)\, dT\, dP - \int f_{T,P,Q_H}(y) \log_2 f_{T,P,Q_H}(y)\, dT. \tag{7} $$

Note that both of the terms on the right-hand side (RHS) in (7) are finite.

3.1 Properties of mutual information

In this subsection, we address some analytical properties of the mutual information function
of the Gaussian channels. This includes properties such as strict concavity and continuity of
the mutual information, which are used in the capacity analysis of channels.

Recall that Proposition 3.3 of Part I addresses the strict concavity of the mutual
information in general. Since the strictness property is essential to addressing the uniqueness
of the capacity-achieving probability measure, simpler arguments are of interest. Using the
following observation, we address this issue.

Observation 3.1. For every $P \in \mathcal{P}_{g,\Gamma}(X)$, $f_{T,P,Q_H}(y)$ is continuous
and nonzero over $Y$.

Proof. By (5), the continuity and positiveness of $f_{T,Q_H}(y|x)$ are obvious. The positiveness
of $f_{T,Q_H}(y|x)$ implies the positiveness of $f_{T,P,Q_H}(y)$. To prove the continuity,
suppose that $y_n \to y$ in the Euclidean norm. The integrand in (6) is bounded by
$e^{\alpha^2\|y_n\|_2^p/\sigma^2}$, which is bounded along a convergent sequence, so the limit
and the integral can be exchanged by dominated convergence. Then,

$$ \lim_{n\to\infty} f_{T,P,Q_H}(y_n) = \beta \int \lim_{n\to\infty} e^{\frac{\alpha^2\|y_n\|_2^p - \|y_n - Hx\|_2^2}{\sigma^2}}\, dP = f_{T,P,Q_H}(y). $$

This proves the continuity of $f_{T,P,Q_H}(y)$ over $Y$. □

We define that two probability measures $P_1, P_2 \in \mathcal{P}_{g,\Gamma}(X)$ are equivalent
over $W_G(H)$ if they induce the same output probability measure. Equivalently, two input
measures are equivalent over $W_G(H)$ if $f_{T,P_1,Q_H}(y) = f_{T,P_2,Q_H}(y)$ for all $y \in Y$.

Proposition 3.1 (Strict concavity). The mutual information of a Gaussian channel
$W_G(H)$ is strictly concave with respect to the convex combination of two input measures
$P_1$ and $P_2$, unless they are equivalent over $W_G(H)$.

Proof. By Observation 3.1, if there exists $y$ such that $f_{T,P_1,Q_H}(y) \neq f_{T,P_2,Q_H}(y)$,
then there exists a neighborhood $U_y \subset Y$ of $y$ with that property. Then, by definition
of $T$ (3), $T(U_y) > 0$. This means that the set $U_y \times \{H\}$ satisfies the requirements
of Proposition 3.3 of Part I. As a result, the mutual information function of a Gaussian channel
is strictly concave with respect to the convex combination of $P_1$ and $P_2$ if and only if
there exists $y \in Y$ such that $f_{T,P_1,Q_H}(y) \neq f_{T,P_2,Q_H}(y)$. By the definition of
equivalency for input probability measures, we deduce the assertion. □

Another important property of the mutual information in capacity analysis is its weak*
continuity, which is stated as follows.
Proposition 3.2 (Continuity). The mutual information of any Gaussian channel $W_G(H)$
is weak* continuous over $\mathcal{P}_{g,\Gamma}(X)$.

Proof. It suffices to check that the hypothesis of Theorem 3.3 of Part I is satisfied. We prove
the theorem for $g(x) = \|x\|_2^2$; the proof for other choices of $g$ is similar. To prove
hypothesis (a) of Theorem 3.3 of Part I, we proceed as follows. Assume that $\alpha < 1$ and
$p < 2$, which are fixed, and let

$$ A_c = \{(x, y) \in X \times Y \mid f_{T,Q_H}(y|x)\, |\log_2 f_{T,Q_H}(y|x)| > c^2\}. $$

Let us define

$$ B_c = \{(x, y) \in X \times Y \mid f_{T,Q_H}(y|x) > c\}. $$

For sufficiently large $c > 0$, e.g. $c > 10$, it can be observed that $A_c \subseteq B_c$. But
the condition $f_{T,Q_H}(y|x) > c$ is equivalent to the condition

$$ \alpha^2\|y\|_2^p - \|y - Hx\|_2^2 > \sigma^2 \ln\frac{c}{\beta}. $$

Hence, if we define

$$ E_c = \Big\{(x, y) \in X \times Y \;\Big|\; \alpha^2\|y\|_2^p > \sigma^2 \ln\frac{c}{\beta}\Big\}, $$

we can deduce that for sufficiently large $c > 0$, we would have
$A_c \subseteq B_c \subseteq E_c$. Let
$\omega(c) = \big(\frac{\sigma^2}{\alpha^2} \ln\frac{c}{\beta}\big)^{1/p}$.
As a result, for every $P \in \mathcal{P}_{g,\Gamma}(X)$

ff fT,QM^)\log,fT,QM^)\dTdP < ff fT,QM^)\log,fT,QM^)\dTdP 
J J Ac J JEc 

< Ii°g2/5| jj^ :^^e~^^d^xydP 

- [-z^^^^i:^)) J J -^d^^ydp 

|log2/^l , log2e 



+ 2 2-.r N ]ima'+ / \\Hxr,dP) 



' u;2(c) a^u;^~P{c) ' ^ mm/ 



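Since $\omega(c) \to \infty$ as $c \to \infty$, we can deduce that

$$ \lim_{c\to\infty} \sup_{P \in \mathcal{P}_{g,\Gamma}(X)} \iint_{A_c} f_{T,Q_H}(y|x)\, |\log_2 f_{T,Q_H}(y|x)|\, dT\, dP = 0. $$

This implies that hypothesis (a) of Theorem 3.3 of Part I holds. To verify hypothesis (b), for
a fixed $y$, let $B_c(y) = \{x \in X : f_{T,Q_H}(y|x) > c\}$. Then,

$$ \sup_{P \in \mathcal{P}_{g,\Gamma}(X)} \int_{B_c(y)} f_{T,Q_H}(y|x)\, dP \le \frac{1}{c} \sup_{P \in \mathcal{P}_{g,\Gamma}(X)} \int \big(f_{T,Q_H}(y|x)\big)^2\, dP $$
$$ = \frac{\beta^2}{c} \sup_{P \in \mathcal{P}_{g,\Gamma}(X)} \int e^{\frac{2(\alpha^2\|y\|_2^p - \|y - Hx\|_2^2)}{\sigma^2}}\, dP \le \frac{\beta^2}{c}\, e^{\frac{2\alpha^2\|y\|_2^p}{\sigma^2}} \to 0, \quad \text{as } c \to \infty. $$

Thus, since both hypotheses of Theorem 3.3 of Part I hold, the mutual information is weak*
continuous. □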



3.2 Capacity analysis 

In this subsection, we address issues on the existence, the uniqueness, and the
characterization of the capacity-achieving measures for Gaussian channels.

Lemma 3.1 (Existence). For any Gaussian channel $W_G(H)$ and the set
$\mathcal{P}_{g,\Gamma}(X)$, there exists a measure $P_o \in \mathcal{P}_{g,\Gamma}(X)$ that
achieves the capacity of the channel $W_G(H)$.

Proof. Proposition 3.2 states the weak* continuity of the mutual information over
$\mathcal{P}_{g,\Gamma}(X)$. Since $\mathcal{P}_{g,\Gamma}(X)$ is weak* compact, by
Proposition 4.1 of Part I, we conclude the assertion. □

Lemma 3.1 states the existence of the capacity-achieving measure over Gaussian channels.
To address its uniqueness, we use an earlier result on strict concavity of mutual information,
i.e., Proposition 3.1.

Lemma 3.2 (Uniqueness). For any Gaussian channel $W_G(H)$ and the set
$\mathcal{P}_{g,\Gamma}(X)$, the capacity-achieving measure is unique up to the equivalency of
input measures.

Proof. Suppose there exist distinct (non-equivalent) probability measures $P_o$ and $P_*$ that
achieve the capacity. By Proposition 3.1, the mutual information is strictly concave with
respect to their convex combination. This means that any measure of the form
$aP_o + (1 - a)P_*$ with $0 < a < 1$, which also belongs to $\mathcal{P}_{g,\Gamma}(X)$ by
convexity of the constraint set, achieves a higher mutual information. This contradicts the
optimality of $P_o$ and $P_*$. Thus, $P_o$ and $P_*$ must be equivalent. □
So far, in this subsection, we have shown the existence and the uniqueness of the
capacity-achieving measure over AWGN channels. It remains to provide some insight into the
characterization of such an input measure.

Proposition 3.3. Let $g(\cdot) = \|\cdot\|_\eta^\eta$ for $1 < \eta$. If $\eta > 2$, the
capacity-achieving measure has a bounded support with no interior point. In contrast, if
$1 < \eta \le 2$, then a necessary condition for $P_o$ is that for sufficiently large $y$ in the
column space of $H$,
$\sup_{r>0} e^{-r^2} P_o(\|y - Hx\|_2 < r\sigma) = O(e^{-\alpha\|y\|_\eta^\eta})$
for some $\alpha > 0$.

Proof. Suppose $P_oW_{Q_H}$ is the optimal capacity-achieving output measure. Applying the
Kuhn-Tucker condition, Theorem 4.3 of Part I, to our problem, we need to find a positive value
$\gamma > 0$ such that

$$ D\big(W_{Q_H}(\cdot|x)\,\|\,P_oW_{Q_H}\big) - \gamma\|x\|_\eta^\eta \le C - \gamma\Gamma. $$

Using some straightforward mathematical manipulation, this results in

$$ -m\log_2(\pi\sigma^2 e) - \int \frac{1}{(\pi\sigma^2)^m}\, e^{-\frac{\|y - Hx\|_2^2}{\sigma^2}} \log_2 f_{P_o,Q_H}(y)\, d\mu_Y - \gamma\|x\|_\eta^\eta \le C - \gamma\Gamma. \tag{8} $$

The problem is now finding an output density function together with the value $\gamma > 0$
such that the above inequality is satisfied with equality for all $x \in X$ in the support of
the capacity-achieving measure.

Case $\eta > 2$: We note that (8) has a constant part and a part that depends on $x$. It can
be seen that for large values of $x$ the term $\gamma\|x\|_\eta^\eta$ is dominant. Hence, for
large values of $x$ to be in the support of $P_o$, the integral term must have a growth rate of
$\|x\|_\eta^\eta$; otherwise, the support is bounded. As a result, it suffices to study the
asymptotic behavior (tail) of the density function
$f_{P_o,Q_H}(y) = \frac{dP_oW_{Q_H}}{d\mu_Y}$. We have

$$ f_{P_o,Q_H}(y) = \frac{1}{(\pi\sigma^2)^m} \int e^{-\frac{\|y - Hx\|_2^2}{\sigma^2}}\, dP_o $$
$$ \ge \frac{1}{(\pi\sigma^2)^m} \int e^{-\frac{\|y - Hx\|_2^2 + \|y + Hx\|_2^2}{\sigma^2}}\, dP_o $$
$$ = \frac{1}{(\pi\sigma^2)^m}\, e^{-\frac{2\|y\|_2^2}{\sigma^2}} \int e^{-\frac{2\|Hx\|_2^2}{\sigma^2}}\, dP_o. $$

This means that $-\log_2 f_{P_o,Q_H}(y) = O(\|y\|_2^2)$. As a result, it can be verified that
for $\eta > 2$, there exists no choice for the input measure such that the integral part of (8)
catches up with the growth rate of $\|x\|_\eta^\eta$ for large values of $x$. Thus, the support
of the capacity-achieving input measure is bounded for $\eta > 2$. One can verify that the
hypothesis of Proposition 4.3 of Part I holds for Gaussian channels. More specifically, the
function $p(z)$ is analytic on $Z$ except possibly at $z = 0$. As a result, we deduce that
$S_X(P_o)$ cannot have any interior point.

Case $\eta \le 2$: Note that every $y \in Y$ can be uniquely decomposed as
$y = y_H + y_{H^\perp}$, where $y_H$ is in the column space of $H$ and $y_{H^\perp}$ is
orthogonal to the column space of $H$. One can observe that the contribution of $y_{H^\perp}$
is on the constant terms of (8) and it does not affect our analysis of the terms that depend on
$x$. As a result, for every $y \in Y$ in the column space of $H$ and every $r > 0$, we have

$$ f_{P_o,Q_H}(y) \ge \frac{1}{(\pi\sigma^2)^m} \int_{\|y - Hx\|_2 < r\sigma} e^{-\frac{\|y - Hx\|_2^2}{\sigma^2}}\, dP_o \ge \frac{1}{\pi^m\sigma^{2m}}\, e^{-r^2} P_o(\|y - Hx\|_2 < r\sigma). $$

Let $k(y) = \sup_r e^{-r^2} P_o(\|y - Hx\|_2 < r\sigma)$. Then, for every $y$ in the column
space of $H$, we deduce that

$$ -\log_2 f_{P_o,Q_H}(y) = O\big(\min(-\log_2 k(y),\, \|y\|_2^2)\big). $$

Two scenarios can be considered. Either the support is bounded or it is not. For the latter
case to be true, we need the integral term to have a growth rate of $\|x\|_\eta^\eta$. Hence,
it is necessary to have $k(y) = O(e^{-\alpha\|y\|_\eta^\eta})$ for sufficiently large $y$ in the
column space of $H$, where $\alpha$ is a positive real number. We remark that this necessary
condition remains valid for the other case also. □

Proposition 3.3 provides us with intuition about the support and the possible behavior of
capacity-achieving measures subject to different choices for $\eta$. However, it is still not
possible to obtain closed-form expressions for the capacity-achieving measure for the general
choice of $g(\cdot) = \|\cdot\|_\eta^\eta$ ($\eta \neq 2$). For the case of $\eta = 2$, on the
other hand, it is possible to characterize the capacity-achieving measure as shown in [8].
Using the observation that among distributions with the same covariance matrix, the Gaussian
distribution achieves the largest differential entropy [23], Telatar [8] proves that the
capacity-achieving input measure must have a Gaussian density function. Here, we use
Kuhn-Tucker conditions, i.e., Theorem 4.3 of Part I, and provide a different approach toward
characterizing the capacity-achieving input measure.



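Theorem 3.1. Let $g(\cdot) = \|\cdot\|_2^2$. The probability measure in
$\mathcal{P}_{g,\Gamma}(X)$ which achieves the capacity of the channel $W_G(H)$ is absolutely
continuous with respect to the Lebesgue measure with a unique (a.e.) Gaussian density function
of zero mean and covariance matrix equal to

$$ S = \big[\mu I - \sigma^2 (H'H)^{\dagger}\big]^+, $$

where $\mu$ is selected such that $\mathrm{tr}(S) = \Gamma$. Moreover, the capacity of this
channel is

$$ C = \log_2\det\Big(I + \frac{1}{\sigma^2} H S H'\Big). $$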



Proof. Suppose $P_oW_{Q_H}$ is the optimal capacity-achieving output measure. Applying the
Kuhn-Tucker condition, Theorem 4.3 of Part I, to our problem, we need to find a positive value
$\gamma > 0$ such that

$$ D\big(W_{Q_H}(\cdot|x)\,\|\,P_oW_{Q_H}\big) - \gamma\|x\|_2^2 \le C - \gamma\Gamma, $$

for every $x \in X$. Using some straightforward mathematical manipulation, this results in

$$ -m\log_2(\pi\sigma^2 e) - \int \log_2 f_{P_o,Q_H}(y)\, \frac{1}{(\pi\sigma^2)^m}\, e^{-\frac{\|y - Hx\|_2^2}{\sigma^2}}\, d\mu_Y - \gamma\|x\|_2^2 \le C - \gamma\Gamma. \tag{9} $$

Now, the problem is to find an output density function together with the value $\gamma > 0$
such that the above inequality is satisfied with equality for all $x \in X$ in the support of
the capacity-achieving measure.

Suppose $x \in X$ is in the support of the optimizing input measure. Then, we need the
integration in the above inequality to result in a quadratic form. To obtain a quadratic
form out of the integral, a straightforward option is to assume
$\log_2 f_{P_o,Q_H}(y) = a - \|by\|_2^2$, where $a$ is a constant value and $b$ is an
$m \times m$ matrix. As a result, we will obtain

$$ -m\log_2(\pi\sigma^2 e) - a + \sigma^2\,\mathrm{tr}(b'b) + \mathrm{tr}(x'H'b'bHx) - \gamma\|x\|_2^2 - C + \gamma\Gamma = 0. $$

To solve this equation, we can separate the constant part and the variable part to obtain

$$ x'(H'b'bH - \gamma I)x = 0, $$
$$ -m\log_2(\pi\sigma^2 e) - a + \sigma^2\,\mathrm{tr}(b'b) - C + \gamma\Gamma = 0. \tag{10} $$
To satisfy the first equation of (10), we need to consider two cases. For any $x$ in the support
of $P_o$, it suffices to have $x$ as an eigenvector of $H'b'bH$ with its corresponding eigenvalue
equal to $\gamma$. This implies that we can take $b$ such that $H'b'bH$ represents a projection
matrix such that its column space denotes the set of all $x$ in the support of $P_o$. For any
$x$ not in the support, it suffices to have $x'(H'b'bH - \gamma I)x \le 0$. Let $t$ denote the
dimension of the space that is spanned by the vectors in the support. Let $M_t$ denote a
diagonal $m \times m$ matrix with the first $t$ elements equal to 1, and the rest of them 0.
Thus, it suffices to take $b$ such that
$b'b = \gamma (HH')^{\dagger} M_t + \beta (I - M_t)$, where $(\cdot)^{\dagger}$ denotes the
pseudo-inverse operator and $\beta > 0$ is selected later to satisfy the above requirements.
Note that $t$ is smaller than or equal to the rank of $H$, since if some $x$ is in the null
space of $H$, it cannot be in the support. Thus, $b'b$ is a full-rank positive definite matrix.
This implies that the density function of $P_oW_{Q_H}$ with respect to the Lebesgue measure is
Gaussian with zero mean and covariance
$(\log_2 e)\big(\gamma^{-1} HH' M_t + \beta^{-1}(I - M_t)\big)$. But, to have such a Gaussian
density at the output, it suffices to have a Gaussian input with zero mean and covariance $S$
such that

$$ HSH' + \sigma^2 I = (\log_2 e)\big(\gamma^{-1} HH' M_t + \beta^{-1}(I - M_t)\big). $$

Thus, we need to pick $S$, $\gamma$, $\beta$, and $t$ to satisfy the above requirement. By the
Kuhn-Tucker condition we need to have $\gamma\big(\int \|x\|_2^2\, dP_o - \Gamma\big) = 0$.
Since $\gamma$ cannot be zero, we need to find $\gamma$ in the latter equation with the
constraint $\mathrm{tr}(S) = \Gamma$. As a result, it suffices to take
$\beta = \frac{\log_2 e}{\sigma^2}$ and to take

$$ S = \Big[\frac{\log_2 e}{\gamma} I - \sigma^2 (H'H)^{\dagger}\Big]^+, $$

where $[\cdot]^+$ ceils negative eigenvalues to 0. It just remains to select $t$ and $\gamma$
such that $\mathrm{tr}(S) = \Gamma$. We emphasize that $t \le \mathrm{rank}(H)$. Thus, to find
the solution we first pick $t = \mathrm{rank}(H)$ and search for the value of $\gamma$. If we
find the solution, we stop. Otherwise we reduce $t$ by 1 and repeat the procedure until we find
the solution.

Picking $\gamma$ and $S$ as explained results in

$$ a = -m\log_2(\pi) - \log_2\det\Big(\frac{\log_2 e}{\gamma} HH' M_t + \sigma^2 (I - M_t)\Big). $$

Furthermore, we would have $\sigma^2\,\mathrm{tr}(b'b) = m\log_2 e - \gamma\Gamma$. Thus,
substituting and satisfying the equality in the second equation of (10) results in

$$ C = \log_2\det\Big(I + \frac{1}{\sigma^2} HSH'\Big), $$

where $S$ is chosen as explained. For the sake of convenience in presentation, let
$\mu = \frac{\log_2 e}{\gamma}$; then we can simplify the determination of $S$ to

$$ S = \big[\mu I - \sigma^2 (H'H)^{\dagger}\big]^+, $$

where $\mu$ is selected such that $\mathrm{tr}(S) = \Gamma$.

The uniqueness property follows from Lemma 3.2 and the fact that there exists no
other choice for $P_o$ that yields the same Gaussian density function at the output. □
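The search over $t$ and $\gamma$ described in the proof corresponds to the classical water-filling procedure. The sketch below is one possible way to carry it out, assuming the eigenvalues $\lambda_i$ of $H'H$ are already available as a list. The function name `water_filling` and its arguments are hypothetical, the water level is $\mu = \log_2 e / \gamma$, and the capacity is evaluated as $\sum_i \log_2(1 + \lambda_i s_i / \sigma^2)$, which equals $\log_2\det(I + HSH'/\sigma^2)$ when $S$ shares the eigenvectors of $H'H$.

```python
import math


def water_filling(eigenvalues, sigma2, gamma_budget):
    """Water-filling over the eigen-directions of H'H (hypothetical helper).

    eigenvalues: nonnegative eigenvalues of H'H (assumed precomputed).
    Returns (powers, mu, capacity_bits) with powers[i] = max(mu - sigma2/lam_i, 0)
    on directions with lam_i > 0, and sum(powers) == gamma_budget.
    """
    # Directions in the null space of H cannot carry any signal.
    order = sorted((i for i, lam in enumerate(eigenvalues) if lam > 0),
                   key=lambda i: eigenvalues[i], reverse=True)
    powers = [0.0] * len(eigenvalues)
    if not order:
        return powers, None, 0.0

    t = len(order)  # start from t = rank(H), then shrink if needed
    while True:
        active = order[:t]
        inv_sum = sum(sigma2 / eigenvalues[i] for i in active)
        mu = (gamma_budget + inv_sum) / t  # enforces tr(S) = budget
        # Accept t once the weakest active direction gets nonnegative power.
        if mu - sigma2 / eigenvalues[active[-1]] >= 0 or t == 1:
            break
        t -= 1

    for i in active:
        powers[i] = mu - sigma2 / eigenvalues[i]
    capacity = sum(math.log2(1.0 + eigenvalues[i] * powers[i] / sigma2)
                   for i in active)
    return powers, mu, capacity


# Equal eigenvalues share the budget equally.
p_eq, mu_eq, c_eq = water_filling([1.0, 1.0], 1.0, 2.0)
# A weak direction is dropped when the budget is small.
p_lo, mu_lo, c_lo = water_filling([4.0, 1.0, 0.0], 1.0, 0.5)
```

In the second call the budget is too small to reach the weaker direction, so $t$ is reduced from 2 to 1 and all power goes to the strongest eigen-direction.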



4 Full Channel State Information at the Receiver 

In mobile communications, it is sometimes possible to obtain an estimate of the channel
realization at the receiver. This is specifically true in block-fading channels with a large
coherence time, i.e., where there is sufficient delay between changes in channel realizations.
In these systems, the transmitter assigns a portion of each block for training, where it sends
some known signals to the receiver, so that the receiver obtains an estimate of the current
channel realization. This process, called channel estimation, allows the receiver to obtain
some information about the channel state, ranging from none to (asymptotically) full channel
state information (CSI).

In this section, we assume that full CSI is available at the receiver. Considering the
general linear statistical model (1), this means that the channel realization, $H_k$, changes
over time but is perfectly known at the receiver. We assume that the channel
is Rayleigh or Rician faded [24], [5]. This means that the channel realization changes
according to a probability measure with a Gaussian density function with respect to the
Lebesgue measure. If the density function is centered (zero mean), the fading is Rayleigh;
otherwise, it is Rician. In either case, the channel statistical law is fully characterized by the
line-of-sight (mean) and scattering component (covariance matrix) of the channel realization
[24], [5].
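As an illustration of the Rayleigh and Rician distinction, the sketch below draws one channel realization as a line-of-sight mean plus a complex Gaussian scattering term. For simplicity it assumes uncorrelated entries with a common variance, which is only a special case of the arbitrary covariance considered in this section; the function name `sample_channel` and its parameters are hypothetical.

```python
import random


def sample_channel(mean, scatter_var, rng):
    """Draw H = mean + scattering with i.i.d. CN(0, scatter_var) entries.

    mean: m x n list of lists (line-of-sight component). An all-zero mean
    gives Rayleigh fading; a nonzero mean gives Rician fading.
    """
    # Real and imaginary parts each carry half of the variance.
    std = (scatter_var / 2.0) ** 0.5
    return [[complex(mu) + complex(rng.gauss(0.0, std), rng.gauss(0.0, std))
             for mu in row] for row in mean]


rng = random.Random(1)
rayleigh_H = sample_channel([[0, 0], [0, 0]], 1.0, rng)   # zero mean
rician_H = sample_channel([[1, 0], [0, 1]], 0.5, rng)     # nonzero mean
los_only = sample_channel([[1, 0], [0, 1]], 0.0, rng)     # no scattering
```

With `scatter_var` set to zero the realization collapses to the line-of-sight matrix, which recovers the fixed-channel setting of Section 3.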

Let $\mathrm{vec}(\cdot)$ denote the vector operator that concatenates the columns of an
$m \times n$ matrix into an $mn \times 1$ vector. Thus, for every channel realization $H$, we
denote its vector form as $\mathrm{vec}\,H$, which is a multivariate random vector in
$\mathbb{C}^{mn}$ that is characterized by the mean value $\mathrm{vec}\,\bar{H}$ and the
spatial covariance matrix

$$ \Sigma = \mathrm{cov}(\mathrm{vec}\,H, \mathrm{vec}\,H). $$

To emphasize that the channel state information is fully known, we use a subscript "F". As
a result, we denote the channel as $W_F(\bar{H}, \Sigma)$. Since the channel realization is
known at the receiver, there exists no advantage in taking $L > 1$ in the capacity analysis.
Hence, as in the previous section, we assume $L = 1$.
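The column-stacking convention of the vec operator can be checked with a few lines of Python. The helper name `vec` and the nested-list matrix layout are assumptions made for illustration.

```python
def vec(H):
    """Stack the columns of an m x n matrix into a list of length m * n."""
    m, n = len(H), len(H[0])
    # The outer loop runs over columns, so each column is kept contiguous.
    return [H[r][c] for c in range(n) for r in range(m)]


H_example = [[1, 2],
             [3, 4]]
stacked = vec(H_example)  # first column [1, 3], then second column [2, 4]
```

Under this convention, entry $h_{rs}$ of an $m \times n$ matrix lands at zero-based position $(s-1)m + (r-1)$ of the stacked vector.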

Since the channel realization is fully known at the receiver, we assume that
$V = \mathbb{C}^{m \times n}$ is the space of state information and the conditional probability
measure $Q_v$ on $\mathcal{H}$ is a point mass measure at $H = v$. Since the channel is Rayleigh
or Rician fading, the probability measure on $V$, i.e., $R$, is absolutely continuous with
respect to the Lebesgue measure with the Gaussian density function

$$ \frac{dR}{d\mu_V} = \frac{1}{\pi^{mn}\det\Sigma}\, e^{-(\mathrm{vec}\,v - \mathrm{vec}\,\bar{H})'\,\Sigma^{-1}\,(\mathrm{vec}\,v - \mathrm{vec}\,\bar{H})}. \tag{11} $$

By definition of the auxiliary measure $T$ (3), one can see that $W_{Q_v}(\cdot|x) \ll T$ for
every $v$ and $x$, where its density function is obtained from (4) as

$$ f_{T,Q_v}(y|x) = \int f_T(y|x, H)\, dQ_v = \beta\, e^{\frac{\alpha^2\|y\|_2^p - \|y - vx\|_2^2}{\sigma^2}}. \tag{12} $$

For every $P \in \mathcal{P}_{g,\Gamma}(X)$, it can be verified that $PW_{Q_v} \ll T$ with the
density function

$$ f_{T,P,Q_v}(y) = \beta \int e^{\frac{\alpha^2\|y\|_2^p - \|y - vx\|_2^2}{\sigma^2}}\, dP. \tag{13} $$

As a result, the mutual information of this channel can be expressed as

$$ I(P, W_F(\bar{H}, \Sigma)) = \iiint f_{T,Q_v}(y|x) \log_2 f_{T,Q_v}(y|x)\, dT\, dP\, dR - \iint f_{T,P,Q_v}(y) \log_2 f_{T,P,Q_v}(y)\, dT\, dR. \tag{14} $$



4.1 Properties of mutual information 

In this subsection, we address properties such as strict concavity and continuity of the
mutual information, which are used in the capacity analysis of fading channels with full CSI at
the receiver.

In Proposition 3.3 of Part I, we have addressed the strict concavity of the mutual
information. Since this property is essential to addressing the uniqueness of the
capacity-achieving probability measure, we give a simpler condition to verify the strict
concavity of the mutual information for fading channels with full CSI at the receiver. Using
the following observation, we address this issue.

Observation 4.1. For every $P \in \mathcal{P}_{g,\Gamma}(X)$, $f_{T,P,Q_v}(y)$ is continuous and nonzero over $Y \times V$.

Proof. By (12), the continuity and positiveness of $f_{T,Q_v}(y|x)$ are obvious. The positiveness of $f_{T,Q_v}(y|x)$ implies the positiveness of $f_{T,P,Q_v}(y)$. To prove the continuity, suppose that $(y_n, v_n) \to (y, v)$ in the Euclidean norm. Then, by the dominated convergence theorem,

$$\lim_n f_{T,P,Q_{v_n}}(y_n) = \lim_n \beta\int e^{\left(\alpha^2\|y_n\|_2^p - \|y_n - v_n x\|_2^2\right)/\sigma^2}\,dP = \beta\int e^{\left(\alpha^2\|y\|_2^p - \|y - vx\|_2^2\right)/\sigma^2}\,dP = f_{T,P,Q_v}(y).$$

This proves the continuity of $f_{T,P,Q_v}(y)$ over $Y \times V$. □

We say that two probability measures $P_1, P_2 \in \mathcal{P}_{g,\Gamma}(X)$ are equivalent over $W_F(\bar{H}, \Sigma)$ if they induce the same conditional output probability measure (conditional on $v$). Equivalently, two input measures are equivalent over $W_F(\bar{H}, \Sigma)$ if $f_{T,P_1,Q_v}(y) = f_{T,P_2,Q_v}(y)$ for all $(y, v) \in Y \times V$ such that $v$ is in the support of $R$. Note that for full-rank $\Sigma$, all $v \in V$ are in the support of $R$.

Proposition 4.1 (Strict concavity). The mutual information of channel $W_F(\bar{H}, \Sigma)$ is strictly concave with respect to the convex combination of two input measures $P_1$ and $P_2$, unless they are equivalent on $W_F(\bar{H}, \Sigma)$.

Proof. By Observation 4.1, if there exists $(y, v) \in Y \times V$ such that $f_{T,P_1,Q_v}(y) \neq f_{T,P_2,Q_v}(y)$, then there exists a neighborhood $U_{(y,v)} \subset Y \times V$ of $(y, v)$ with that property. By definition of $T$ (3) and $R$ (11), $(T \times R)(U_{(y,v)}) > 0$. This implies that the set $U_{(y,v)}$ satisfies the requirements of Proposition 3.3 of Part I. Thus, the mutual information function of an FCSI channel is strictly concave with respect to the convex combination of $P_1$ and $P_2$ if and only if there exists $(y, v) \in Y \times V$ such that $f_{T,P_1,Q_v}(y) \neq f_{T,P_2,Q_v}(y)$. □

Another important property of the mutual information in capacity analysis is its weak* continuity, which is stated as follows.

Proposition 4.2 (Continuity). The mutual information of any fading channel $W_F(\bar{H}, \Sigma)$ is weak* continuous over $\mathcal{P}_{g,\Gamma}(X)$.

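Proof. It suffices to check that the hypotheses of Theorem 3.3 of Part I are satisfied. We prove the theorem for $g(x) = \|x\|_2^2$; the proof for other choices of $g$ is similar. To prove hypothesis (a) of Theorem 3.3 of Part I, we proceed as in the proof of Proposition 3.2. Assume that $\alpha < 1$ and $p < 2$ are fixed. Define

$$A_c = \{(x,y,v) \in X \times Y \times V \mid f_{T,Q_v}(y|x)\,|\log_2 f_{T,Q_v}(y|x)| > c^2\}.$$

Also define

$$B_c = \{(x,y,v) \in X \times Y \times V \mid f_{T,Q_v}(y|x) > c\}.$$

For sufficiently large $c > 0$, e.g., $c > 10$, it can be observed that $A_c \subset B_c$. But $f_{T,Q_v}(y|x) > c$ is equivalent to $\alpha^2\|y\|_2^p - \|y - vx\|_2^2 > \sigma^2 \ln\frac{c}{\beta}$. Hence, if we define

$$E_c = \left\{(x,y,v) \in X \times Y \times V \,\middle|\, \alpha^2\|y\|_2^p > \sigma^2 \ln\frac{c}{\beta}\right\},$$

we can deduce that for sufficiently large $c > 0$, we have $A_c \subseteq B_c \subseteq E_c$. Let $\omega(c) = \left(\frac{\sigma^2}{\alpha^2}\ln\frac{c}{\beta}\right)^{1/p}$. We have

$$\begin{aligned}
\iiint_{A_c} f_{T,Q_v}(y|x)\,|\log_2 f_{T,Q_v}(y|x)|\,dT\,dP\,dR
&\le \iiint_{E_c} f_{T,Q_v}(y|x)\,|\log_2 f_{T,Q_v}(y|x)|\,dT\,dP\,dR \\
&\le |\log_2\beta| \iiint_{E_c} \frac{1}{\pi^m\sigma^{2m}}\, e^{-\|y-vx\|_2^2/\sigma^2}\,d\mu_Y\,dP\,dR \\
&\quad + \frac{2\alpha^2\log_2 e}{\sigma^2} \iiint_{E_c} \frac{\|y\|_2^p}{\pi^m\sigma^{2m}}\, e^{-\|y-vx\|_2^2/\sigma^2}\,d\mu_Y\,dP\,dR \\
&\le \left(\frac{|\log_2\beta|}{\omega^2(c)} + \frac{2\alpha^2\log_2 e}{\sigma^2\,\omega^{2-p}(c)}\right)\left(m\sigma^2 + \iint \|vx\|_2^2\,dP\,dR\right) \\
&\le \left(\frac{|\log_2\beta|}{\omega^2(c)} + \frac{2\alpha^2\log_2 e}{\sigma^2\,\omega^{2-p}(c)}\right)\left(m\sigma^2 + \Gamma\int \|v\|_2^2\,dR\right).
\end{aligned}$$

Since $\left(m\sigma^2 + \Gamma\int\|v\|_2^2\,dR\right) < \infty$ and $\omega(c) \to \infty$ as $c \to \infty$, we can deduce that

$$\lim_{c\to\infty}\ \sup_{P \in \mathcal{P}_{g,\Gamma}(X)} \iiint_{A_c} f_{T,Q_v}(y|x)\,|\log_2 f_{T,Q_v}(y|x)|\,dT\,dP\,dR = 0.$$

This implies that hypothesis (a) of Theorem 3.3 of Part I holds.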

To verify hypothesis (b), for fixed $y$ and $v$, let $A_c = \{x \in X \mid f_{T,Q_v}(y|x) > c\}$. Then hypothesis (b) follows from Chebyshev's inequality. Thus, both hypotheses of Theorem 3.3 of Part I hold. Hence, the mutual information is weak* continuous. □



4.2 Capacity analysis

In this subsection, we address the existence, the uniqueness, and the characterization of the capacity-achieving measure for fading channels with full CSI at the receiver.

Lemma 4.1 (Existence). For any channel $W_F(\bar{H}, \Sigma)$ and the set of input measures $\mathcal{P}_{g,\Gamma}(X)$, there exists a measure $P_o \in \mathcal{P}_{g,\Gamma}(X)$ that achieves the capacity of $W_F(\bar{H}, \Sigma)$.

Proof. By Proposition 4.2, the mutual information is continuous over $\mathcal{P}_{g,\Gamma}(X)$. Since $\mathcal{P}_{g,\Gamma}(X)$ is weak* compact, the existence is guaranteed by Proposition 4.1 of Part I. □

One immediate consequence of the strict concavity of the mutual information, Proposition 4.1, is the uniqueness of the capacity-achieving measure, as shown below.

Lemma 4.2 (Uniqueness). For any fading channel $W_F(\bar{H}, \Sigma)$ and the set of input measures $\mathcal{P}_{g,\Gamma}(X)$, the capacity-achieving measure is unique up to the equivalency of input measures.

Proof. The proof is similar to the proof of Lemma 3.2. □

We remark that using Lemma 4.2 with some simple intuitive arguments, one can justify that the capacity-achieving input measure is symmetric.

So far, in this section, we have addressed the existence and the uniqueness of the capacity-achieving measure over fading channels with full CSI at the receiver. It remains to provide some insight into the characterization of the capacity-achieving measure.

Proposition 4.3. Let $g(\cdot) = \|\cdot\|_2^\eta$ for $1 < \eta$. If $\eta > 2$, the capacity-achieving measure has a bounded support with no interior point. In contrast, if $1 < \eta < 2$, then a necessary condition for $P_o$ is that for almost every $v \in V$, $\sup_{r>0} e^{-r^2} P_o(\|y - vx\|_2 < r\sigma) = O\!\left(e^{-a(v)\|y\|_2^\eta}\right)$ for sufficiently large $y$ in the column space of $v$ and for some positive function $a : V \to \mathbb{R}^+$.

Proof. Let $P_oW_{Q_v}$ denote the optimal conditional capacity-achieving output measure. Applying the Kuhn-Tucker condition, Theorem 4.3 of Part I, to our problem, we need to find a value $\gamma \geq 0$ such that

$$\int D\!\left(W_{Q_v}(\cdot|x)\,\|\,P_oW_{Q_v}\right)dR - \gamma\|x\|_2^\eta \leq C - \gamma\Gamma.$$

Using some straightforward mathematical manipulation, this results in

$$-m\log_2(\pi\sigma^2 e) - \iint \frac{1}{\pi^m\sigma^{2m}}\, e^{-\|y-vx\|_2^2/\sigma^2}\log_2 f_{P_o,Q_v}(y)\,d\mu_Y\,dR - \gamma\|x\|_2^\eta \leq C - \gamma\Gamma. \tag{15}$$

The problem is now to find an output density function together with the value $\gamma \geq 0$ such that the above inequality is satisfied with equality for all $x \in X$ in the support of the capacity-achieving measure.

It can be inspected that for large values of $x$ the term $\gamma\|x\|_2^\eta$ is dominant. Hence, the integral term must have a growth rate equal to or smaller than $\gamma\|x\|_2^\eta$. As a result, it suffices to study the asymptotic behavior (tail) of the density function $f_{P_o,Q_v}(y)$.

Suppose $y \in Y$ and $v \in V$ are fixed and $y$ is in the column space of $v$. Let $k_v(y) = \sup_r e^{-r^2} P_o(\|y - vx\|_2 < r\sigma)$. Similar to the proof of Proposition 3.3, we can deduce that for $y$ in the column space of $v$,

$$-\log_2 f_{P_o,Q_v}(y) = O\!\left(\min\left(-\log_2 k_v(y),\, \|y\|_2^2\right)\right).$$

Now, consider this together with (15). It can be verified that for $\eta > 2$, there exists no choice of the input measure that catches up with the growth rate of $\|x\|_2^\eta$ for large values of $x$. This implies that the support of the input measure is bounded for $\eta > 2$. One can verify that the hypothesis of Proposition 4.3 of Part I holds for fading channels with full CSI. More specifically, the function $p(z)$ is analytic on $Z$ except possibly at $z = 0$. As a result, we deduce that $S_X(P_o)$ cannot have any interior point. On the other hand, for $\eta < 2$, a necessary condition for the input measure is $k_v(y) = O\!\left(e^{-a(v)\|y\|_2^\eta}\right)$ for sufficiently large $y$ in the column space of $v$, where $a : V \to \mathbb{R}^+$. □

The capacity-achieving measures for the general choice of $g(\cdot) = \|\cdot\|_2^\eta$ are not known. However, for the case $\eta = 2$, this problem was first addressed in [8] and solved for the i.i.d. Rayleigh distribution. Telatar [8] showed that the capacity-achieving input distribution is an isotropic Gaussian distribution. Foschini [7] has shown similar results.

Here, we address the capacity of the channel in the presence of arbitrary correlation and line-of-sight components. For this purpose, we use the Kuhn-Tucker condition, Theorem 4.3 of Part I.

Theorem 4.1. Let $g(\cdot) = \|\cdot\|_2^2$. The probability measure in $\mathcal{P}_{g,\Gamma}(X)$ that achieves the capacity of the channel $W_F(\bar{H}, \Sigma)$ is absolutely continuous with respect to the Lebesgue measure with a unique (a.e.) Gaussian density function of zero mean and covariance matrix $S$ that satisfies

$$\int x'v'(vSv' + \sigma^2 I_m)^{-1}vx\,dR \leq \mu\|x\|_2^2, \qquad \forall x \in X,$$

where the equality occurs if and only if $x$ is in the support of the capacity-achieving measure and $\mu > 0$ is selected to satisfy $\mathrm{tr}(S) = \Gamma$. Moreover, the capacity of this channel is

$$C = \int \log_2\det\!\left(I_m + \tfrac{1}{\sigma^2}\,vSv'\right)dR.$$

Proof. Suppose $P_oW_{Q_v}$ is the optimal capacity-achieving output measure. Applying the Kuhn-Tucker condition, Theorem 4.3 of Part I, to our problem, we need to find a value $\gamma \geq 0$ such that

$$\int D\!\left(W_{Q_v}(\cdot|x)\,\|\,P_oW_{Q_v}\right)dR - \gamma\|x\|_2^2 \leq C - \gamma\Gamma.$$

Using some straightforward mathematical manipulation, this results in

$$-m\log_2(\pi\sigma^2 e) - \iint \frac{1}{\pi^m\sigma^{2m}}\, e^{-\|y-vx\|_2^2/\sigma^2}\log_2 f_{P_o,Q_v}(y)\,d\mu_Y\,dR - \gamma\|x\|_2^2 \leq C - \gamma\Gamma.$$

The problem is now to find an output density function together with the value $\gamma \geq 0$ such that the above inequality is satisfied with equality for all $x \in X$ in the support of the capacity-achieving measure.

Suppose $x \in X$ is in the support of the optimizing input measure. Then, we need the integration in the above inequality to result in a quadratic form. To obtain a quadratic form out of the integral, a straightforward option is to assume $\log_2 f_{P_o,Q_v}(y) = a(v) - \|b(v)y\|_2^2$, where $a(v)$ is a function and $b(v)$ is a mapping from $V$ to the space of $m \times m$ matrices. As a result, we obtain

$$-m\log_2(\pi\sigma^2 e) + \int\left[-a(v) + \sigma^2\,\mathrm{tr}\!\left(b(v)'b(v)\right) + x'v'b(v)'b(v)vx\right]dR - \gamma\|x\|_2^2 - C + \gamma\Gamma = 0.$$

To solve this equation, we can separate the constant part and the variable part to obtain

$$\begin{aligned}
\int x'v'b(v)'b(v)vx\,dR - \gamma\|x\|_2^2 &= 0, \\
-m\log_2(\pi\sigma^2 e) + \int\left[-a(v) + \sigma^2\,\mathrm{tr}\!\left(b(v)'b(v)\right)\right]dR - C + \gamma\Gamma &= 0.
\end{aligned} \tag{16}$$

To satisfy the first equation of (16), we proceed as follows. For any $x$ in the support of $P_o$, it suffices to have $x$ as an eigenvector of $E\!\left(v'b(v)'b(v)v\right)$ with its corresponding eigenvalue equal to $\gamma$. For any $x$ not in the support of $P_o$, we must have $x'E\!\left(v'b(v)'b(v)v\right)x - \gamma\|x\|_2^2 \leq 0$. This implies that we should select $b(v)$ such that the maximal eigenvalues of $E\!\left(v'b(v)'b(v)v\right)$ (those that correspond to the support of $P_o$) are equal to $\gamma$ and the rest of its eigenvalues are less than $\gamma$. One immediate approach to selecting $b(v)$ is to assume that the input measure is a zero-mean multivariate normal with covariance $S$. This implies that $b(v)'b(v) = \log_2 e\,(vSv' + \sigma^2 I_m)^{-1}$. Therefore, it remains to pick a positive semidefinite matrix $S$ such that $\log_2 e\;E\!\left(v'(vSv' + \sigma^2 I_m)^{-1}v\right)$ has the desired structure. That is, we need to find $S$ such that

$$\log_2 e\;E\!\left(v'(vSv' + \sigma^2 I_m)^{-1}v\right) - \gamma I \leq 0$$

in consideration of the other constraints arising from (16). It can be inspected that such a choice for $S$ exists and depends on the value of $\gamma$. By the Kuhn-Tucker condition we need to have $\gamma\left(\int\|x\|_2^2\,dP_o - \Gamma\right) = 0$. Since $\gamma$ cannot be zero, we need to find $\gamma$ such that $\mathrm{tr}(S) = \Gamma$. For convenience in presentation, define $\mu = \gamma/\log_2 e$. Now, multiplying the above relation from the left by $x'$ and from the right by $x$, and taking the expectation with respect to $x$, one can verify that we obtain the equality

$$\log_2 e\int \sigma^2\,\mathrm{tr}\!\left((vSv' + \sigma^2 I_m)^{-1}\right)dR - m\log_2 e + \gamma\Gamma = 0. \tag{17}$$

We have

$$a(v) = -m\log_2\pi - \log_2\det(vSv' + \sigma^2 I_m).$$

Substituting in (16), we obtain

$$\int \log_2\det\!\left(I_m + \tfrac{1}{\sigma^2}\,vSv'\right)dR - C + \int \sigma^2\,\mathrm{tr}\!\left(b(v)'b(v)\right)dR - m\log_2 e + \gamma\Gamma = 0.$$

In consideration of (17), we obtain

$$C = \int \log_2\det\!\left(I_m + \tfrac{1}{\sigma^2}\,vSv'\right)dR.$$

The uniqueness property follows from Lemma 4.2 and the fact that there exists no other input measure that can induce the same conditional output measure with Gaussian density function. □
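
For a candidate covariance matrix $S$, the capacity expression of Theorem 4.1 is an expectation over the fading law $R$ and can be approximated by a sample average. The following Python sketch shows one way to do this, assuming that channel realizations are supplied as a list of $m \times n$ NumPy arrays drawn from $R$; the function name `capacity_estimate` and the Monte Carlo approach are illustrative assumptions rather than a method given in the text.

```python
import numpy as np

def capacity_estimate(S, sigma2, channels):
    """Sample average of log2 det(I_m + v S v^H / sigma2) over channel draws v."""
    total = 0.0
    for v in channels:
        m = v.shape[0]
        G = np.eye(m) + (v @ S @ v.conj().T) / sigma2
        # slogdet returns the natural log of |det G|, which is numerically stable.
        _, logdet = np.linalg.slogdet(G)
        total += logdet / np.log(2.0)
    return total / len(channels)
```

The returned value approximates the mutual information achieved by a zero-mean Gaussian input with covariance `S`; it equals the capacity only when `S` also satisfies the condition of Theorem 4.1.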

Note that Theorem 4.1 characterizes the capacity-achieving measure for any Rician MIMO channel with full CSI at the receiver. In general, a closed-form solution for $S$ is not obtainable from Theorem 4.1, and $S$ should be found through numerical search. However, for special cases of channels, one may exploit the structure of the channel to solve the conditions in Theorem 4.1. As an example, consider the following corollary.

Corollary 4.1. If $W_F(\bar{H}, \Sigma)$ is an i.i.d. Rayleigh channel, i.e., $\bar{H} = 0$ and $\Sigma = I$, then the capacity-achieving measure has an isotropic Gaussian density function with zero mean and covariance matrix $S = \frac{\Gamma}{n} I$.

Proof. Suppose $S$ is the covariance matrix that satisfies the condition in Theorem 4.1. Let $U$ be an $n \times n$ unitary matrix. Since $R$ is invariant under the operation of $U$, it can be inspected that $USU'$ is also a valid candidate. By uniqueness of the capacity-achieving measure, we deduce that $S = USU'$ for every unitary matrix $U$. This implies that $S = \lambda I$ for some $\lambda > 0$. Since $\mathrm{tr}(S) = \Gamma$, we deduce that $S = \frac{\Gamma}{n} I$. □
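
The condition of Theorem 4.1 can be checked numerically for the isotropic candidate of Corollary 4.1. The sketch below forms the sample average of $v'(vSv' + \sigma^2 I_m)^{-1}v$ and measures how far it is from a multiple of the identity; for i.i.d. Rayleigh draws and $S = \frac{\Gamma}{n} I$ the deviation should be small, up to sampling error. The function names `kkt_matrix` and `isotropy_gap`, and the use of a finite sample in place of the expectation over $R$, are assumptions made for illustration.

```python
import numpy as np

def kkt_matrix(S, sigma2, channels):
    """Sample average of v^H (v S v^H + sigma2 I_m)^{-1} v over channel draws v."""
    n = S.shape[0]
    acc = np.zeros((n, n), dtype=complex)
    for v in channels:
        m = v.shape[0]
        G = v @ S @ v.conj().T + sigma2 * np.eye(m)
        acc += v.conj().T @ np.linalg.solve(G, v)
    return acc / len(channels)

def isotropy_gap(M):
    """Return (mu, gap): the best scalar fit mu and the Frobenius norm of M - mu I."""
    n = M.shape[0]
    mu = np.trace(M).real / n
    return mu, np.linalg.norm(M - mu * np.eye(n))
```

When the input covariance has full support, the theorem requires equality for all $x$, so the averaged matrix should be close to $\mu I$; the fitted `mu` then plays the role of the multiplier $\mu$ in Theorem 4.1.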

Recall that this is the same result as given in [8] for Rayleigh channels. One can verify that if $S$ is not singular, then for large signal-to-noise ratio (SNR), the covariance matrix $S$ would be very close to $\frac{\Gamma}{n} I$. Hence, for large SNR, an input measure with isotropic Gaussian distribution is near optimal. Relevant work can be found in [9], [25], and [26].

5 No Channel State Information at the Receiver 

In some applications, the system setup does not allow any estimation of the channel real- 
ization at the receiver. This is specifically true in fast fading channels, where there is not 
sufficient delay available between changes in the channel realization. 

In this section, we consider Rayleigh or Rician fading channels where no CSI is available at the receiver. Regarding the general linear statistical model (1), this means that the channel realization, $H$, changes through time in accordance with a Gaussian probability density function, but the realization is not known at the receiver. As in the previous section, we characterize these channels with the mean value $\bar{H}$ and the covariance matrix $\Sigma$ of the fading. To emphasize that CSI is not known, we use a subscript "N", and denote the channel as $W_N(\bar{H}, \Sigma)$.

Since we assumed that no CSI is available at the receiver, the space of side information, $V$, is statistically independent of $H$. In the framework of the generic system model (Section 3), this is equivalent to assuming an arbitrary Borel measurable space $V$ with an arbitrary measure $R$ on it such that for every given $v$, the conditional probability measure on $H$, $Q_v$, has a Gaussian density function characterized by $\bar{H}$ and $\Sigma$. Since the measure does not depend on $v$, we drop the index $v$ and simply use $Q$ instead.

Since the channel realization is not known at the receiver, the block length $L$ is important in the capacity analysis of these channels. Hence, we assume $X = \mathbb{C}^{n \times L}$ for a general $L > 1$. By definition of the auxiliary measure $T$ (3), one can inspect that $W_Q(\cdot|x) \ll T$ for every $x$ with the density function (obtained from (4))

$$f_{T,Q}(y|x) = \frac{\beta}{\det\Phi_x}\exp\!\left(\frac{\alpha^2\|y\|_2^p - \mathrm{vec}(y - \bar{H}x)'\,\Phi_x^{-1}\,\mathrm{vec}(y - \bar{H}x)}{\sigma^2}\right), \tag{18}$$

where $\Phi_x = \frac{1}{\sigma^2}(x' \otimes I_m)\Sigma(x \otimes I_m) + I_{mL}$. For every $P \in \mathcal{P}_{g,\Gamma}(X)$, let

$$f_{T,P,Q}(y) = \int f_{T,Q}(y|x)\,dP \tag{19}$$

denote the output density function. The mutual information of this channel is expressed as

$$I(P, W_N(\bar{H},\Sigma)) = \iint f_{T,Q}(y|x)\log_2 f_{T,Q}(y|x)\,dT\,dP - \int f_{T,P,Q}(y)\log_2 f_{T,P,Q}(y)\,dT. \tag{20}$$

One should note that the maximum of (20) should be divided by L to determine the capacity 
per channel use. 
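
The matrix $\Phi_x$ in (18) is the covariance of the vectorized block output, normalized by the noise variance, when the channel is unknown at the receiver. The Python sketch below builds it with a Kronecker product. It assumes the column-stacking convention for vec, under which $\mathrm{vec}(Hx) = (x^{T} \otimes I_m)\,\mathrm{vec}\,H$; the paper's own transpose convention in $(x' \otimes I_m)\Sigma(x \otimes I_m)$ may differ, and the function name `phi_x` is hypothetical.

```python
import numpy as np

def phi_x(x, Sigma, sigma2, m):
    """Normalized covariance of vec(y) given x, for y = H x + noise with H unknown.

    x is n x L, Sigma is the (mn) x (mn) covariance of vec(H), sigma2 the noise variance.
    """
    n, L = x.shape
    # With column-stacking vec, vec(H x) = K vec(H).
    K = np.kron(x.T, np.eye(m))
    return K @ Sigma @ K.conj().T / sigma2 + np.eye(m * L)
```

Because the first term is positive semidefinite, every eigenvalue of the result is at least one, which is consistent with the property $\det\Phi_x \geq 1$ used in the continuity proof.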

5.1 Properties of mutual information 

In this subsection, we address properties such as strict concavity and continuity of the mutual 
information for fading channels with no CSI at the receiver. 

Proposition 3.3 of Part I addressed the strict concavity of the mutual information in general. Since the strictness property is essential to establishing the uniqueness of the capacity-achieving probability measure, we give a simpler condition for the case of fading channels with no CSI at the receiver.

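Observation 5.1. For every $P \in \mathcal{P}_{g,\Gamma}(X)$, $f_{T,P,Q}(y)$ is continuous and nonzero over $Y$.

Proof. By (18), the continuity and positiveness of $f_{T,Q}(y|x)$ are obvious. The positiveness of $f_{T,Q}(y|x)$ implies the positiveness of $f_{T,P,Q}(y)$. To prove the continuity, suppose that $y_n \to y$ in the Euclidean norm. Then,

$$\begin{aligned}
\lim_n f_{T,P,Q}(y_n) &= \lim_n \int f_{T,Q}(y_n|x)\,dP \\
&= \lim_n \beta\, e^{\alpha^2\|y_n\|_2^p/\sigma^2}\int \frac{1}{\det\Phi_x}\exp\!\left(-\frac{\mathrm{vec}(y_n - \bar{H}x)'\,\Phi_x^{-1}\,\mathrm{vec}(y_n - \bar{H}x)}{\sigma^2}\right)dP \\
&\overset{\text{by DCT}}{=} \beta\, e^{\alpha^2\|y\|_2^p/\sigma^2}\int \frac{1}{\det\Phi_x}\exp\!\left(-\frac{\mathrm{vec}(y - \bar{H}x)'\,\Phi_x^{-1}\,\mathrm{vec}(y - \bar{H}x)}{\sigma^2}\right)dP \\
&= f_{T,P,Q}(y).
\end{aligned}$$

This proves the continuity of $f_{T,P,Q}(y)$ over $Y$. □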

We say that two probability measures $P_1, P_2 \in \mathcal{P}_{g,\Gamma}(X)$ are equivalent over $W_N(\bar{H}, \Sigma)$ if they induce the same output probability measure. Equivalently, two input measures are equivalent over $W_N(\bar{H}, \Sigma)$ if $f_{T,P_1,Q}(y) = f_{T,P_2,Q}(y)$ for all $y \in Y$.

Proposition 5.1 (Strict concavity). The mutual information of channel $W_N(\bar{H}, \Sigma)$ is strictly concave with respect to the convex combination of two input measures $P_1$ and $P_2$, unless they are equivalent.

Proof. By Observation 5.1, if there exists $y \in Y$ such that $f_{T,P_1,Q}(y) \neq f_{T,P_2,Q}(y)$, then there exists a neighborhood $U_y \subset Y$ of $y$ with that property. Then, by definition of $T$ (3), $T(U_y) > 0$. This implies that the set $U_y$ satisfies the requirements of Proposition 3.3 of Part I. Thus, the mutual information of an NCSI channel is strictly concave with respect to the convex combination of $P_1$ and $P_2$ if and only if there exists $y \in Y$ such that $f_{T,P_1,Q}(y) \neq f_{T,P_2,Q}(y)$. By definition of equivalency of measures over $W_N(\bar{H}, \Sigma)$, we deduce the assertion. □

In the following, we state and prove another important property of mutual information, 
weak* continuity, which will be used later in the capacity analysis of fading channels with 
no CSI. 

Proposition 5.2 (Continuity). The mutual information of any fading channel $W_N(\bar{H}, \Sigma)$ is weak* continuous over $\mathcal{P}_{g,\Gamma}(X)$.

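Proof. It suffices to check that the hypotheses of Theorem 3.3 of Part I are satisfied. We prove the theorem for $g(x) = \|x\|_2^2$; the proof for other choices of $g$ is similar. To prove hypothesis (a) of Theorem 3.3 of Part I, we proceed as in the proof of Proposition 3.2. For every $P \in \mathcal{P}_{g,\Gamma}(X)$, let

$$A_c = \{(x,y) \in X \times Y \mid f_{T,Q}(y|x)\,|\log_2 f_{T,Q}(y|x)| > c^2\}.$$

Let

$$B_c = \{(x,y) \in X \times Y \mid f_{T,Q}(y|x) > c\};$$

for sufficiently large $c > 0$, it can be observed that $A_c \subset B_c$. Let

$$D_c = \left\{(x,y) \in X \times Y \,\middle|\, \alpha^2\|y\|_2^p - \mathrm{vec}(y - \bar{H}x)'\,\Phi_x^{-1}\,\mathrm{vec}(y - \bar{H}x) > \sigma^2\ln\frac{c\det\Phi_x}{\beta}\right\}.$$

It can be inspected that $B_c \subset D_c$. We also note that for sufficiently large $c > 0$, $\alpha^2\|y\|_2^p > \mathrm{vec}(y - \bar{H}x)'\,\Phi_x^{-1}\,\mathrm{vec}(y - \bar{H}x)$ on $D_c$. Let us also define

$$E_c = \left\{(x,y) \in X \times Y \,\middle|\, \|y\|_2^p > \frac{\sigma^2}{\alpha^2}\ln\frac{c\det\Phi_x}{\beta}\right\}.$$

Thus, for sufficiently large $c > 0$, $D_c \subset E_c$. Noting that $\det\Phi_x \geq 1$, define $\omega(c) = \left(\frac{\sigma^2}{\alpha^2}\ln\frac{c}{\beta}\right)^{1/p}$. By an argument similar to that in the proof of Proposition 3.2, we can deduce that

$$\begin{aligned}
\iint_{A_c} f_{T,Q}(y|x)\,|\log_2 f_{T,Q}(y|x)|\,dT\,dP
&\le \iint_{E_c} f_{T,Q}(y|x)\,|\log_2 f_{T,Q}(y|x)|\,dT\,dP \\
&\le \iint_{E_c} \left(|\log_2\beta| + \frac{2\alpha^2\log_2 e}{\sigma^2}\|y\|_2^p\right) \frac{e^{-\frac{1}{\sigma^2}\mathrm{vec}(y-\bar{H}x)'\Phi_x^{-1}\mathrm{vec}(y-\bar{H}x)}}{\pi^{mL}\sigma^{2mL}\det\Phi_x}\,d\mu_Y\,dP \\
&\le \left(\frac{|\log_2\beta|}{\omega^2(c)} + \frac{2\alpha^2\log_2 e}{\sigma^2\,\omega^{2-p}(c)}\right) \iint \|y\|_2^2\, \frac{e^{-\frac{1}{\sigma^2}\mathrm{vec}(y-\bar{H}x)'\Phi_x^{-1}\mathrm{vec}(y-\bar{H}x)}}{\pi^{mL}\sigma^{2mL}\det\Phi_x}\,d\mu_Y\,dP.
\end{aligned}$$

If we take $\sup_{P \in \mathcal{P}_{g,\Gamma}(X)}$ of both sides, the second factor of the RHS is a finite value. Now, applying $\lim_{c\to\infty}$ to both sides, we observe that $\omega(c) \to \infty$ as $c \to \infty$. Hence,

$$\lim_{c\to\infty}\ \sup_{P \in \mathcal{P}_{g,\Gamma}(X)} \iint_{A_c} f_{T,Q}(y|x)\,|\log_2 f_{T,Q}(y|x)|\,dT\,dP = 0.$$

This implies that hypothesis (a) of Theorem 3.3 of Part I holds. Hypothesis (b) is verified by Chebyshev's inequality, in essentially the same way as in the proof of Proposition 4.2. Thus, both hypotheses of Theorem 3.3 of Part I hold, so the mutual information is weak* continuous. □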



5.2 Capacity analysis 

In this subsection, we address the existence, the uniqueness, and the characterization of the capacity-achieving measures for fading channels with no CSI at the receiver.

Lemma 5.1 (Existence). For any channel $W_N(\bar{H}, \Sigma)$, there exists a measure $P_o \in \mathcal{P}_{g,\Gamma}(X)$ that achieves the capacity of $W_N(\bar{H}, \Sigma)$ over $\mathcal{P}_{g,\Gamma}(X)$.

Proof. Proposition 5.2 states the weak* continuity of the mutual information over $\mathcal{P}_{g,\Gamma}(X)$. Since $\mathcal{P}_{g,\Gamma}(X)$ is weak* compact, by Proposition 4.1 of Part I, we conclude the assertion. □

Lemma 5.1 states the existence of the capacity-achieving measure over fading channels with no CSI. One immediate consequence of the strict concavity of the mutual information, Proposition 5.1, is the uniqueness of the capacity-achieving measure, as shown below.

Lemma 5.2 (Uniqueness). For any channel $W_N(\bar{H}, \Sigma)$, the capacity-achieving measure over $\mathcal{P}_{g,\Gamma}(X)$ is unique up to the equivalency of input measures.

Proof. The proof is essentially the same as the proof of Lemma 3.2. □

So far, in this section, we have shown the existence and uniqueness of the capacity-achieving measure over fading channels with no CSI at the receiver. It remains to provide some insight into the characterization of such an input measure.

Proposition 5.3. Suppose $g(\cdot) = \|\cdot\|_2^\eta$ for $1 < \eta$. If $\eta > 2$, the capacity-achieving measure has a bounded support with no interior point. In contrast, if $1 < \eta < 2$, a necessary condition for the capacity-achieving measure is

$$P_o\!\left(\|y\|_2 \leq \|x\|_2,\ \mathcal{R}(y) \subset \mathcal{R}(x)\right) = O\!\left(e^{-a\|y\|_2^\eta}\right),$$

where $\mathcal{R}(\cdot)$ denotes the row space of a matrix, and $a > 0$ is a constant.

Proof. Suppose $P_oW_Q$ is the optimal capacity-achieving output measure. Applying the Kuhn-Tucker condition, Theorem 4.3 of Part I, to our problem, we need to find a value $\gamma \geq 0$ such that

$$D\!\left(W_Q(\cdot|x)\,\|\,P_oW_Q\right) - \gamma\|x\|_2^\eta \leq C - \gamma\Gamma.$$

We note that $W_Q(\cdot|x) \ll \mu_Y$ with density function

$$f_Q(y|x) = \frac{1}{\pi^{mL}\sigma^{2mL}\det\Phi_x}\, e^{-\frac{1}{\sigma^2}\mathrm{vec}(y-\bar{H}x)'\,\Phi_x^{-1}\,\mathrm{vec}(y-\bar{H}x)},$$

where $\Phi_x$ is defined as in (18). Using some straightforward mathematical manipulation, we obtain

$$-mL\log_2(\pi e\sigma^2) - \int \log_2 f_{P_o,Q}(y)\, f_Q(y|x)\,d\mu_Y - \log_2(\det\Phi_x) - \gamma\|x\|_2^\eta \leq C - \gamma\Gamma. \tag{21}$$

Now, the problem is to find an output density function together with the value $\gamma \geq 0$ such that the above inequality is satisfied with equality for all $x \in X$ in the support of the capacity-achieving measure. Unfortunately, because of the inherent difficulties in this expression, it is not possible to find an analytic solution for the output density function. However, through some asymptotic analysis we can obtain some intuition on the characterization of the support of the capacity-achieving measure, as follows.

Case $\eta > 2$: It can be inspected that for large values of $x$, the term $\gamma\|x\|_2^\eta$ is dominant. Hence, the integral term must have a growth rate of $\|x\|_2^\eta$; otherwise, the support of the capacity-achieving measure is bounded. Thus, it suffices to study the asymptotic (tail) behavior of the density function $f_{P_o,Q}(y)$. For every fixed $y$ we have

$$\begin{aligned}
f_{P_o,Q}(y) &= \int \frac{1}{\pi^{mL}\sigma^{2mL}\det\Phi_x}\, e^{-\frac{1}{\sigma^2}\mathrm{vec}(y-\bar{H}x)'\,\Phi_x^{-1}\,\mathrm{vec}(y-\bar{H}x)}\,dP_o \\
&\geq \int \frac{1}{\pi^{mL}\sigma^{2mL}\det\Phi_x}\, e^{-\frac{2}{\sigma^2}\mathrm{vec}\,y'\,\Phi_x^{-1}\,\mathrm{vec}\,y \,-\, \frac{2}{\sigma^2}\mathrm{vec}(\bar{H}x)'\,\Phi_x^{-1}\,\mathrm{vec}(\bar{H}x)}\,dP_o \\
&\geq e^{-\frac{2}{\sigma^2}\|y\|_2^2}\int \frac{1}{\pi^{mL}\sigma^{2mL}\det\Phi_x}\, e^{-\frac{2}{\sigma^2}\mathrm{vec}(\bar{H}x)'\,\Phi_x^{-1}\,\mathrm{vec}(\bar{H}x)}\,dP_o.
\end{aligned}$$

This means that $-\log_2 f_{P_o,Q}(y) = O(\|y\|_2^2)$. As a result, it can be verified that for $\eta > 2$, there exists no choice of the input measure for which the integral part of (21) catches up with the growth rate of $\|x\|_2^\eta$ for large values of $x$. Thus, the support of the capacity-achieving input measure is bounded for $\eta > 2$. Trying to use Proposition 4.3 of Part I, one can verify that $p(z)$ is analytic over $Z$ except on the zero set of some polynomials, say $A \subset Z$. Since the zero sets of polynomials are closed sets including boundary points [27], the set $U = Z \setminus A$ is a connected set. Because of the presence of $\log_2(\det\Phi_x)$ in (21), we know that the zero set of $p(z)$ is not the solution set of a polynomial. Thus, the image of $S_X(P_o)$ in $Z$ has a nonempty intersection with $U$. Now, applying Proposition 4.3 of Part I, we deduce that $S_X(P_o)$ cannot have any interior point.

Case $\eta < 2$: Recalling that $\Phi_x = \frac{1}{\sigma^2}(x' \otimes I_m)\Sigma(x \otimes I_m) + I_{mL}$, one can verify that for every $x \in X$,

$$\frac{\lambda_{\min}}{\sigma^2}\,(x' \otimes I_m)(x \otimes I_m) + I_{mL} \;\leq\; \Phi_x \;\leq\; \frac{\lambda_{\max}}{\sigma^2}\,(x' \otimes I_m)(x \otimes I_m) + I_{mL},$$

where $\lambda_{\max}$ and $\lambda_{\min} > 0$ are the maximum and minimum eigenvalues of $\Sigma$. Let the operator $\mathcal{R}(\cdot)$ denote the row space of a matrix. Then,

$$\begin{aligned}
f_{P_o,Q}(y) &\geq \int \frac{1}{\pi^{mL}\sigma^{2mL}\det\Phi_x}\, e^{-\frac{2}{\sigma^2}\mathrm{vec}\,y'\,\Phi_x^{-1}\,\mathrm{vec}\,y \,-\, \frac{2}{\sigma^2}\mathrm{vec}(\bar{H}x)'\,\Phi_x^{-1}\,\mathrm{vec}(\bar{H}x)}\,dP_o \\
&\geq \int \frac{1}{\pi^{mL}\left(\lambda_{\max}\|x\|_2^2 + \sigma^2\right)^{mL}}\, e^{-2\,\mathrm{tr}\left(y(\lambda_{\min}x'x + \sigma^2 I_L)^{-1}y'\right)}\, e^{-2\,\mathrm{tr}\left(\bar{H}x(\lambda_{\min}x'x + \sigma^2 I_L)^{-1}x'\bar{H}'\right)}\,dP_o \\
&\geq \int_{\|y\|_2 \leq \|x\|_2,\ \mathcal{R}(y) \subset \mathcal{R}(x)} \frac{k}{\left(\lambda_{\max}\|x\|_2^2 + \sigma^2\right)^{mL}}\,dP_o,
\end{aligned}$$

where $k = k(\bar{H}, \Sigma)$ is a nonzero constant dependent on $\bar{H}$ and $\Sigma$. Assuming that

$$P_o\!\left(\|y\|_2 \leq \|x\|_2,\ \mathcal{R}(y) \subset \mathcal{R}(x)\right) = \Theta\!\left(e^{-l(y)}\right),$$

for some positive function $l : Y \to \mathbb{R}^+$, where $l(y) = \omega(\ln\|y\|_2)$, for sufficiently large $y$ we would obtain

$$f_{P_o,Q}(y) \geq \Theta\!\left(e^{-l(y)}\right).$$

But this means that

$$-\log_2 f_{P_o,Q}(y) = O\!\left(\min\left(\|y\|_2^2,\ l(y)\right)\right).$$

Now, considering this together with (21), we can observe that for large values of $x$ the integral part of (21) behaves as $O\!\left(\min\left(\|x\|_2^2,\ l(x)\right)\right)$. Thus, a necessary condition for the capacity-achieving measure is $l(x) = \Omega(\|x\|_2^\eta)$. □

We remark that (21) in Proposition 5.3 provides a necessary and sufficient condition for the capacity-achieving measure for Rician and Rayleigh channels (with full-rank covariance matrix) subject to any moment constraint of order $\eta > 1$. However, it is still not possible to solve (21) to find the capacity-achieving measure for the general choice of $\eta$, and this problem remains open for future investigation. For the case $\eta = 2$, this problem has been addressed to some extent for Rayleigh channels in [15], [16], and [20]. For the case of SISO Rayleigh channels, it has been shown [16] that the capacity-achieving distribution has a finite number of mass points. Also, for the MIMO channel with isotropic Rayleigh distribution, the authors of [15] have conjectured that the support of the capacity-achieving measure is in the form of concentric spheres around the origin with no interior point. More results can be found in [28].

6 Partial Channel State Information at the Receiver 

In some fading channels, the system setup allows estimation of the channel at the receiver to some extent. In such cases, we assume that the channel state information is partially available at the receiver. Thus, in consideration of the general system model (1), we assume that the channel realization, $H$, is partially known at the receiver. We assume that the channel realization is governed by a Gaussian distribution characterized by $\bar{H}$ and $\Sigma$ (full-rank), as before. The channel state information is assumed to be available at the receiver in the form of elements from an LCH Borel-measurable space $V$, where each value $v \in V$ is an estimate of the channel realization $H$. We assume that $H \times V$ is associated with a measure $Q \circ R$ which has a joint Gaussian density function. That is, for every $v \in V$, the measure $Q_v$ has a Gaussian density function characterized by

$$\begin{aligned}
\mu_{H|v} &= \bar{H} + \Sigma_{Hv}\Sigma_{vv}^{-1}(v - \mu_v), \\
\Sigma_{H|v} &= \Sigma - \Sigma_{Hv}\Sigma_{vv}^{-1}\Sigma_{vH},
\end{aligned}$$

where $\Sigma_{Hv}$ (and $\Sigma_{vH}$) denotes the cross-covariance of $H$ and $v$, and $\mu_v$ and $\Sigma_{vv}$ are the mean and covariance of the Gaussian density function of the probability measure $R$ on $V$. To emphasize that the channel state information is partially known, we use a subscript "P" and denote the channel as $W_P(\mu_{H|v}, \Sigma_{H|v})$. To avoid extra difficulties, we assume that $\Sigma_{H|v}$ is full-rank for all $v \in V$.
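
The conditional statistics above follow the standard Gaussian conditioning formulas and can be computed directly. The Python sketch below operates on vectorized quantities (so `h_bar`, `v_bar`, and `v` are vectors); the function name `conditional_channel_stats` and the argument layout are assumptions made for illustration.

```python
import numpy as np

def conditional_channel_stats(h_bar, v_bar, Sigma_hh, Sigma_hv, Sigma_vv, v):
    """Mean and covariance of vec(H) given the estimate v, for jointly Gaussian (H, v)."""
    # gain = Sigma_hv Sigma_vv^{-1}, obtained from a linear solve instead of an explicit inverse.
    gain = np.linalg.solve(Sigma_vv.conj().T, Sigma_hv.conj().T).conj().T
    mu = h_bar + gain @ (v - v_bar)
    # Sigma_vH is the conjugate transpose of Sigma_Hv.
    cov = Sigma_hh - gain @ Sigma_hv.conj().T
    return mu, cov
```

As a sanity check on the two extremes discussed in this part: if the estimate is perfect, the conditional covariance vanishes and the model reduces to the full-CSI case, whereas if $\Sigma_{Hv} = 0$ the estimate is uninformative and the conditional statistics equal $\bar{H}$ and $\Sigma$, as in the no-CSI case.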

Since the CSI is not fully known at the receiver, the block length $L$ is important in the capacity analysis. Hence, we assume $X = \mathbb{C}^{n \times L}$. Note that $W_{Q_v}(\cdot|x) \ll T$ with a density function (obtained from (4))

$$f_{T,Q_v}(y|x) = \frac{\beta}{\det\Phi_{x,v}}\exp\!\left(\frac{\alpha^2\|y\|_2^p - \mathrm{vec}(y - \mu_{H|v}x)'\,\Phi_{x,v}^{-1}\,\mathrm{vec}(y - \mu_{H|v}x)}{\sigma^2}\right), \tag{22}$$

where $\Phi_{x,v} = \frac{1}{\sigma^2}(x' \otimes I_m)\Sigma_{H|v}(x \otimes I_m) + I_{mL}$. For every input measure $P \in \mathcal{P}_{g,\Gamma}(X)$, we have $PW_{Q_v} \ll T$ with a density function

$$f_{T,P,Q_v}(y) = \int f_{T,Q_v}(y|x)\,dP. \tag{23}$$

As a result, the mutual information of this channel is expressed as

$$I(P, W_P(\mu_{H|v},\Sigma_{H|v})\,|\,R) = \iiint f_{T,Q_v}(y|x)\log_2 f_{T,Q_v}(y|x)\,dT\,dP\,dR - \iint f_{T,P,Q_v}(y)\log_2 f_{T,P,Q_v}(y)\,dT\,dR. \tag{24}$$

Note that the capacity per channel use is obtained by dividing the maximum of (24) by $L$.

6.1 Properties of mutual information

In Proposition 3.3 of Part I, we addressed the strict concavity of the mutual information in general. Since the strictness property is essential to establishing the uniqueness of the capacity-achieving probability measure, here we give a simpler condition for verifying this property of the mutual information function for fading channels with partial CSI at the receiver.

Observation 6.1. For every $P \in \mathcal{P}_{g,\Gamma}(X)$, $f_{T,P,Q_v}(y)$ is continuous and nonzero for all $(y, v) \in Y \times V$.

Proof. By (22), the continuity and positiveness of $f_{T,Q_v}(y|x)$ are obvious. The positiveness of $f_{T,Q_v}(y|x)$ implies the positiveness of $f_{T,P,Q_v}(y)$. To prove the continuity, suppose that $(y_n, v_n) \to (y, v)$ in the Euclidean norm. Then, by the dominated convergence theorem,

$$\lim_n f_{T,P,Q_{v_n}}(y_n) = \lim_n \int f_{T,Q_{v_n}}(y_n|x)\,dP = \int f_{T,Q_v}(y|x)\,dP = f_{T,P,Q_v}(y).$$

This proves the continuity of $f_{T,P,Q_v}(y)$ over $Y \times V$. □

We say that two probability measures $P_1, P_2 \in \mathcal{P}_{g,\Gamma}(X)$ are equivalent over $W_P(\mu_{H|v}, \Sigma_{H|v})$ if they induce the same conditional output probability measure (conditional on $v$). Equivalently, two input measures are equivalent over $W_P(\mu_{H|v}, \Sigma_{H|v})$ if $f_{T,P_1,Q_v}(y) = f_{T,P_2,Q_v}(y)$ for all $(y, v) \in Y \times V$ such that $v$ is in the support of $R$.

Proposition 6.1 (Strict concavity). The mutual information of channel $W_P(\mu_{H|v}, \Sigma_{H|v})$ is strictly concave with respect to the convex combination of two input measures $P_1$ and $P_2$, unless they are equivalent over $W_P(\mu_{H|v}, \Sigma_{H|v})$.

Proof. By Observation 6.1, if there exists $(y, v) \in Y \times V$ such that $f_{T,P_1,Q_v}(y) \neq f_{T,P_2,Q_v}(y)$, then there exists a neighborhood $U_{(y,v)} \subset Y \times V$ of $(y, v)$ with that property. Then, by definition of $T \times R$ (3), $(T \times R)(U_{(y,v)}) > 0$. This implies that the set $U_{(y,v)}$ satisfies the requirements of Proposition 3.3 of Part I. Thus, the mutual information function of a PCSI channel is strictly concave with respect to the convex combination of $P_1$ and $P_2$ if and only if there exists $(y, v) \in Y \times V$ such that $f_{T,P_1,Q_v}(y) \neq f_{T,P_2,Q_v}(y)$. □

Another important property of mutual information, which is useful in capacity analysis, is its continuity, as shown below.

Proposition 6.2 (Continuity). The mutual information of any channel $W_P(\mu_{H|v}, \Sigma_{H|v})$ is weak* continuous over $\mathcal{P}_{g,\Gamma}(X)$.

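Proof. It suffices to check that the hypotheses of Theorem 3.3 of Part I are satisfied. We prove the theorem for $g(x) = \|x\|_2^2$; the proof for other choices of $g$ is similar. To prove hypothesis (a) of Theorem 3.3 of Part I, we proceed as in the proof of Proposition 3.2. For every $P \in \mathcal{P}_{g,\Gamma}(X)$, let

$$A_c = \{(x,y,v) \in X \times Y \times V \mid f_{T,Q_v}(y|x)\,|\log_2 f_{T,Q_v}(y|x)| > c^2\}.$$

Let

$$B_c = \{(x,y,v) \in X \times Y \times V \mid f_{T,Q_v}(y|x) > c\};$$

for sufficiently large $c > 0$, it can be observed that $A_c \subset B_c$. Let

$$D_c = \left\{(x,y,v) \in X \times Y \times V \,\middle|\, \alpha^2\|y\|_2^p - \mathrm{vec}(y - \mu_{H|v}x)'\,\Phi_{x,v}^{-1}\,\mathrm{vec}(y - \mu_{H|v}x) > \sigma^2\ln\frac{c\det\Phi_{x,v}}{\beta}\right\}.$$

It can be inspected that $B_c \subset D_c$. For sufficiently large $c > 0$, it can be inspected that $\alpha^2\|y\|_2^p > \mathrm{vec}(y - \mu_{H|v}x)'\,\Phi_{x,v}^{-1}\,\mathrm{vec}(y - \mu_{H|v}x)$ on $D_c$. Let us also define

$$E_c = \left\{(x,y,v) \in X \times Y \times V \,\middle|\, \|y\|_2^p > \frac{\sigma^2}{\alpha^2}\ln\frac{c\det\Phi_{x,v}}{\beta}\right\}.$$

Thus, for sufficiently large $c > 0$, $D_c \subseteq E_c$. Noting that $\det\Phi_{x,v} \geq 1$, define $\omega(c) = \left(\frac{\sigma^2}{\alpha^2}\ln\frac{c}{\beta}\right)^{1/p}$. By an argument similar to that in the proof of Proposition 3.2, we can deduce that

$$\begin{aligned}
\iiint_{A_c} f_{T,Q_v}(y|x)\,|\log_2 f_{T,Q_v}(y|x)|\,dT\,dP\,dR
&\le \iiint_{E_c} f_{T,Q_v}(y|x)\,|\log_2 f_{T,Q_v}(y|x)|\,dT\,dP\,dR \\
&\le \iiint_{E_c} \left(|\log_2\beta| + \frac{2\alpha^2\log_2 e}{\sigma^2}\|y\|_2^p\right) \frac{e^{-\frac{1}{\sigma^2}\mathrm{vec}(y-\mu_{H|v}x)'\Phi_{x,v}^{-1}\mathrm{vec}(y-\mu_{H|v}x)}}{\pi^{mL}\sigma^{2mL}\det\Phi_{x,v}}\,d\mu_Y\,dP\,dR \\
&\le \left(\frac{|\log_2\beta|}{\omega^2(c)} + \frac{2\alpha^2\log_2 e}{\sigma^2\,\omega^{2-p}(c)}\right) \iiint \|y\|_2^2\, \frac{e^{-\frac{1}{\sigma^2}\mathrm{vec}(y-\mu_{H|v}x)'\Phi_{x,v}^{-1}\mathrm{vec}(y-\mu_{H|v}x)}}{\pi^{mL}\sigma^{2mL}\det\Phi_{x,v}}\,d\mu_Y\,dP\,dR.
\end{aligned}$$

If we take $\sup_{P \in \mathcal{P}_{g,\Gamma}(X)}$ of both sides, the second factor of the RHS is a finite value. Now, applying $\lim_{c\to\infty}$ to both sides, we observe that $\omega(c) \to \infty$ as $c \to \infty$. Hence,

$$\lim_{c\to\infty}\ \sup_{P \in \mathcal{P}_{g,\Gamma}(X)} \iiint_{A_c} f_{T,Q_v}(y|x)\,|\log_2 f_{T,Q_v}(y|x)|\,dT\,dP\,dR = 0.$$

This implies that hypothesis (a) of Theorem 3.3 of Part I holds. Similar to the proof of Proposition 4.2, one can verify that hypothesis (b) holds. Thus, both hypotheses of Theorem 3.3 of Part I hold, so the mutual information is weak* continuous. □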

6.2 Capacity analysis 

In this subsection, we address issues on the existence, the uniqueness, and the characteriza- 
tion of the capacity-achieving measures for fading channels with partial CSI at the receiver. 

Lemma 6.1 (Existence). For every channel Wp{fiH\v,^H\v) , there exists a measure Pg G 
£^g^r{^) that achieves the capacity of Wp{ij,h\v,^h\v) over ^gj^^X). 

Proof. Proposition 6.2 states the weak* continuity of the mutual information over $\mathcal{P}_{g,\Gamma}(\mathcal{X})$.
Since $\mathcal{P}_{g,\Gamma}(\mathcal{X})$ is weak* compact, by Proposition 4.1 of Part I, we conclude the assertion. □

Lemma 6.1 states the existence of the capacity-achieving measure over fading channels
with partial CSI. An immediate consequence of the strict concavity of mutual information
(Proposition 6.1) is the uniqueness of the capacity-achieving measure, as shown below.

Lemma 6.2 (Uniqueness). For every channel $W_p(\mu_{H|v}, \Sigma_{H|v})$, the capacity-achieving
measure over $\mathcal{P}_{g,\Gamma}(\mathcal{X})$ is unique up to the equivalency of input measures.

Proof. The proof is essentially the same as the proof of Lemma 3.2. □ 

So far in this section, we have shown the existence and uniqueness of the capacity-achieving
measure over fading channels with partial CSI at the receiver. It remains to provide
some insight into the characterization of this input measure.

Proposition 6.3. Suppose $g(\cdot) = \|\cdot\|_2^\eta$ for $\eta \ge 1$. If $\eta > 2$, the capacity-achieving measure
has a bounded support with no interior point. In contrast, if $1 \le \eta \le 2$, a necessary condition
for the capacity-achieving measure is

$$P_o\bigl(\|y\|_2 < \lambda_{\min}(v)\|x\|_2,\ \mathcal{R}(y) \subset \mathcal{R}(x)\bigr) = O\bigl(e^{-\alpha(v)\|y\|_2^\eta}\bigr),$$

where $\mathcal{R}(\cdot)$ denotes the row space of a matrix, $\lambda_{\min}(v)$ denotes the minimum eigenvalue of
$\Sigma_{H|v}$, and $\alpha : \mathcal{V} \to \mathbb{R}^+$ is a positive function.

Proof. Suppose $P_oW_{Q_v}$ is the capacity-achieving output measure. Applying the Kuhn-Tucker
condition, Theorem 4.3 of Part I, to our problem, we need to find a positive value
$\gamma > 0$ such that

$$\int D\bigl(W_{Q_v}(\cdot|x)\,\|\,P_oW_{Q_v}\bigr)\,dR - \gamma\|x\|_2^\eta \le C - \gamma\Gamma.$$

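We note that $W_{Q_v}(\cdot|x) \ll \mu_Y$ with density function

$$f_{Q_v}(y|x) = \frac{1}{\pi^{mL}\sigma^{2mL}\det\Phi_{x,v}} \exp\Bigl(-\frac{1}{\sigma^2}\,\mathrm{vec}(y - \mu_{H|v}x)'\,\Phi_{x,v}^{-1}\,\mathrm{vec}(y - \mu_{H|v}x)\Bigr),$$

where $\Phi_{x,v}$ is defined as in (22). After some straightforward manipulation, we
obtain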

$$-mL\log_2(\pi e\sigma^2) + \iint \log_2\frac{1}{f_{P_o,Q_v}(y)}\,f_{Q_v}(y|x)\,d\mu_Y\,dR - \int \log_2(\det\Phi_{x,v})\,dR - \gamma\|x\|_2^\eta \le C - \gamma\Gamma. \qquad (25)$$

Now, the problem is to find an output density function together with a value $\gamma > 0$ such
that the above inequality is satisfied with equality for all $x \in \mathcal{X}$ in the support of the
capacity-achieving measure. Unfortunately, because of the inherent difficulties in this expression, it is
not possible to find an analytic solution for the output density function. However, asymptotic
analysis gives some intuition about the support of the capacity-achieving measure, as follows.
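
The step from the Kuhn-Tucker condition to (25) can be sketched as follows. This is a reader's reconstruction rather than the authors' own derivation. It assumes that $\mu_Y$ is Lebesgue measure on the output space, that $R$ is a probability measure on $\mathcal{V}$, and that, given $x$ and $v$, the output is circularly symmetric complex Gaussian with covariance $\sigma^2\Phi_{x,v}$.

```latex
\begin{align*}
D\bigl(W_{Q_v}(\cdot|x)\,\|\,P_oW_{Q_v}\bigr)
  &= \int f_{Q_v}(y|x)\log_2 f_{Q_v}(y|x)\,d\mu_Y
   + \int f_{Q_v}(y|x)\log_2\frac{1}{f_{P_o,Q_v}(y)}\,d\mu_Y, \\
\int f_{Q_v}(y|x)\log_2 f_{Q_v}(y|x)\,d\mu_Y
  &= -\log_2\bigl((\pi e)^{mL}\det(\sigma^2\Phi_{x,v})\bigr) \\
  &= -mL\log_2(\pi e\sigma^2) - \log_2\det\Phi_{x,v}.
\end{align*}
```

Integrating the first identity over $R$ and subtracting $\gamma\|x\|_2^\eta$ yields the left-hand side of (25). The second identity is the negative differential entropy of an $mL$-dimensional complex Gaussian vector.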

Case $\eta > 2$: For large values of $\|x\|_2$, the term $\gamma\|x\|_2^\eta$ is dominant.
Hence, the first integral term in (25) must have a growth rate of $\|x\|_2^\eta$; otherwise,
the support of the capacity-achieving measure is bounded. Thus, it suffices to study the
asymptotic (tail) behavior of the density function $f_{P_o,Q_v}(y) = \int f_{Q_v}(y|x)\,dP_o$. For every fixed
$y \in \mathcal{Y}$, we have

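$$\begin{aligned}
f_{P_o,Q_v}(y) &= \int \frac{1}{\pi^{mL}\sigma^{2mL}\det\Phi_{x,v}}\, e^{-\frac{1}{\sigma^2}\mathrm{vec}(y - \mu_{H|v}x)'\Phi_{x,v}^{-1}\mathrm{vec}(y - \mu_{H|v}x)}\,dP_o \\
&\ge e^{-\frac{2}{\sigma^2}\|y\|_2^2} \int \frac{1}{\pi^{mL}\sigma^{2mL}\det\Phi_{x,v}}\, e^{-\frac{2}{\sigma^2}\mathrm{vec}(\mu_{H|v}x)'\Phi_{x,v}^{-1}\mathrm{vec}(\mu_{H|v}x)}\,dP_o.
\end{aligned}$$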
This means that $-\log_2 f_{P_o,Q_v}(y) = O(\|y\|_2^2)$. As a result, it can be verified that for $\eta > 2$,
no choice of the input measure makes the growth rate of the first integral of
(25) catch up with the growth rate of $\|x\|_2^\eta$ for large values of $\|x\|_2$. Thus, the support of the
capacity-achieving input measure is bounded for $\eta > 2$. Using an argument similar to the
one in the proof of Proposition 5.3, one can deduce that the support of $P_o$ has no interior
point.
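
The quadratic growth of the negative log output density can be illustrated in the simplest setting. The sketch below is not from the paper. It assumes a scalar channel ($m = n = L = 1$) with zero-mean fading and no side information, and a hypothetical three-point input measure. The names `output_density`, `quadratic_envelope`, `ATOMS`, and `PROBS` are illustrative only.

```python
import math


def output_density(y_abs, atoms, probs, fading_var=1.0, noise_var=1.0):
    """Output density at |y| for a scalar zero-mean fading channel.

    Given input x, the output is circularly symmetric complex Gaussian with
    variance fading_var * |x|**2 + noise_var. The input measure is discrete,
    so the output density is a finite Gaussian mixture.
    """
    total = 0.0
    for x, p in zip(atoms, probs):
        var = fading_var * abs(x) ** 2 + noise_var
        total += p * math.exp(-(y_abs ** 2) / var) / (math.pi * var)
    return total


def neg_log2_density(y_abs, atoms, probs, fading_var=1.0, noise_var=1.0):
    """Return -log2 of the mixture output density at |y|."""
    return -math.log2(output_density(y_abs, atoms, probs, fading_var, noise_var))


def quadratic_envelope(y_abs, atoms, probs, fading_var=1.0, noise_var=1.0):
    """Upper bound on -log2 f(y) that is quadratic in |y|.

    Each mixture term is at least
    min(probs) * exp(-|y|^2 / noise_var) / (pi * max_var).
    """
    max_var = fading_var * max(abs(x) for x in atoms) ** 2 + noise_var
    const = math.log(math.pi * max_var / min(probs))
    return (y_abs ** 2 / noise_var + const) / math.log(2)


# Hypothetical input measure: three mass points with these probabilities.
ATOMS = [0.0, 1.5, 3.0]
PROBS = [0.5, 0.3, 0.2]

if __name__ == "__main__":
    for y_abs in (1.0, 5.0, 10.0, 20.0):
        print(y_abs,
              neg_log2_density(y_abs, ATOMS, PROBS),
              quadratic_envelope(y_abs, ATOMS, PROBS))
```

In this toy setting the negative log density never exceeds a quadratic in $|y|$, so it cannot match a cost term growing like $|x|^\eta$ with $\eta > 2$.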

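Case $\eta \le 2$: Recalling that $\Phi_{x,v} = \frac{1}{\sigma^2}(x' \otimes I_m)\Sigma_{H|v}(x \otimes I_m) + I_{mL}$, one can verify that for
every $x \in \mathcal{X}$,

$$\Bigl(\frac{\lambda_{\min}(v)}{\sigma^2}\,x'x + I_L\Bigr) \otimes I_m \le \Phi_{x,v} \le \Bigl(\frac{\lambda_{\max}(v)}{\sigma^2}\,\|x\|_2^2 + 1\Bigr) I_{mL},$$

where $\lambda_{\max}(v)$ and $\lambda_{\min}(v) > 0$ are the maximum and minimum eigenvalues of $\Sigma_{H|v}$. Let the
operator $\mathcal{R}(\cdot)$ denote the row space of a matrix. Then,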

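$$\begin{aligned}
f_{P_o,Q_v}(y) &\ge \int \frac{1}{\pi^{mL}\sigma^{2mL}\det\Phi_{x,v}}\, e^{-\frac{2}{\sigma^2}\mathrm{vec}(y)'\Phi_{x,v}^{-1}\mathrm{vec}(y) - \frac{2}{\sigma^2}\mathrm{vec}(\mu_{H|v}x)'\Phi_{x,v}^{-1}\mathrm{vec}(\mu_{H|v}x)}\,dP_o \\
&\ge \int \frac{e^{-2\,\mathrm{tr}\left(y(\lambda_{\min}(v)x'x + \sigma^2 I_L)^{-1}y'\right)}\, e^{-2\,\mathrm{tr}\left(\mu_{H|v}x(\lambda_{\min}(v)x'x + \sigma^2 I_L)^{-1}x'\mu_{H|v}'\right)}}{\pi^{mL}(\lambda_{\max}(v)\|x\|_2^2 + \sigma^2)^{mL}}\,dP_o \\
&\ge \int_{\|y\|_2 < \lambda_{\min}(v)\|x\|_2,\ \mathcal{R}(y) \subset \mathcal{R}(x)} \frac{k(v)}{(\lambda_{\max}(v)\|x\|_2^2 + \sigma^2)^{mL}}\,dP_o,
\end{aligned}$$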

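where $k(v) = k(\mu_{H|v}, \Sigma_{H|v})$ is a nonzero function from $\mathcal{V}$ to $\mathbb{R}^+$. Assuming that

$$P_o\bigl(\|y\|_2 < \lambda_{\min}(v)\|x\|_2,\ \mathcal{R}(y) \subset \mathcal{R}(x)\bigr) = \Theta\bigl(e^{-l(y)\alpha(v)}\bigr)$$

for some positive functions $l : \mathcal{Y} \to \mathbb{R}^+$ and $\alpha : \mathcal{V} \to \mathbb{R}^+$, where $l(y) = \omega(\ln\|y\|_2)$, we would
obtain

$$f_{P_o,Q_v}(y) \ge \Theta\bigl(e^{-l(y)\alpha(v)}\bigr).$$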

But this means that

$$-\log_2 f_{P_o,Q_v}(y) = O\bigl(\min(\|y\|_2^2,\ l(y)\alpha(v))\bigr).$$

Considering this together with (25), we observe that for large values of $\|x\|_2$ the
integral part of (25) behaves as $O(\min(\|x\|_2^2, l(x)))$. Thus, a necessary condition for the
capacity-achieving measure is $l(x) = \Omega(\|x\|_2^\eta)$. This concludes the assertion. □
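
The two-sided matrix bound on $\Phi_{x,v}$ used in the second case of the proof can be checked numerically in a small instance. The sketch below is not from the paper. It assumes real-valued matrices, a single receive antenna ($m = 1$, so the Kronecker factors are trivial), and $n = L = 2$, which allows an exact 2x2 positive semidefiniteness test. All function names are illustrative.

```python
import math


def matmul(a, b):
    """Plain-Python product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*a)]


def sym_eigs_2x2(s):
    """Eigenvalues (min, max) of a symmetric 2x2 matrix in closed form."""
    mean = (s[0][0] + s[1][1]) / 2.0
    radius = math.hypot((s[0][0] - s[1][1]) / 2.0, s[0][1])
    return mean - radius, mean + radius


def is_psd_2x2(a, tol=1e-9):
    """A symmetric 2x2 matrix is PSD iff its trace and determinant are >= 0."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return trace >= -tol and det >= -tol


def phi_and_bounds(x, cov, noise_var):
    """Return (lower, phi, upper) for real 2x2 x and cov, assuming m = 1."""
    lam_min, lam_max = sym_eigs_2x2(cov)
    xt = transpose(x)
    gram = matmul(xt, x)                # x'x
    quad = matmul(matmul(xt, cov), x)   # x' Sigma x
    fro2 = sum(v * v for row in x for v in row)
    eye = [[1.0, 0.0], [0.0, 1.0]]
    idx = range(2)
    phi = [[quad[i][j] / noise_var + eye[i][j] for j in idx] for i in idx]
    lower = [[lam_min * gram[i][j] / noise_var + eye[i][j] for j in idx]
             for i in idx]
    upper = [[(lam_max * fro2 / noise_var + 1.0) * eye[i][j] for j in idx]
             for i in idx]
    return lower, phi, upper


def subtract(a, b):
    """Entrywise difference of two 2x2 matrices."""
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]


def bounds_hold(x, cov, noise_var):
    """Check lower <= phi <= upper in the positive semidefinite order."""
    lower, phi, upper = phi_and_bounds(x, cov, noise_var)
    return (is_psd_2x2(subtract(phi, lower))
            and is_psd_2x2(subtract(upper, phi)))
```

For example, `bounds_hold([[1.0, 2.0], [0.5, -1.0]], [[2.0, 0.3], [0.3, 1.0]], 1.0)` evaluates both inequalities for one input matrix and one covariance. A passing check only illustrates the bound on that instance; it does not replace the argument in the proof.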

We remark that (25) in Proposition 6.3 provides necessary and sufficient conditions for
the capacity-achieving measure of Rician or Rayleigh channels (with full-rank covariance
matrix) subject to any moment constraint $\eta \ge 1$. However, it is still not possible to solve
(25) and determine the capacity-achieving measures. Hence, this problem remains open for
future investigation.

7 Conclusion 

In this part, we presented a unified approach to the capacity analysis of multiple antenna
channels. We used the results of Part I to analyze the capacity of multiple antenna
channels in a unified manner, irrespective of the type of fading, the amount of correlation, and the
amount of channel state information (CSI) available at the receiver. We
studied the mutual information function and some of its analytical properties, such as strict
concavity and continuity, for additive white Gaussian noise (AWGN) channels, fading channels
with full CSI at the receiver, fading channels with no CSI, and fading channels with partial
CSI at the receiver. Then, for each type of channel, we studied the capacity value as well
as issues such as the existence, uniqueness, and characterization of the capacity-achieving
measures.

For channels with no CSI or partial CSI at the receiver, we provided necessary and sufficient
conditions for the characterization of the capacity-achieving measures and used asymptotic
analysis to characterize the tail behavior of these measures. However, a closed-form expression
or full characterization of these measures remains open. As
a direction for future research, one might consider the problem of characterizing the
capacity-achieving measure for channels with no CSI or partial CSI at the receiver.